Harry Roy
Biology Department
Rensselaer Polytechnic Institute
Troy NY 12180-3590
ABSTRACT
I used the automated testing function of WebCT to administer pre- and post-tests to students in a sophomore Genetics and Evolution course at Rensselaer. This class minimizes lecturing by using a problem-solving approach together with extensive simulation programs and exercises. In one class the gain in student learning, g = (posttest - pretest)/(100 - pretest), was 0.54 ± 0.32 SD. The high standard deviation reflects the small class size (13 students). A second class, taught to a fairly select group of students, the six-year BioMeds, had a gain of 0.87 ± 0.09 SD. A third class, taught to a general group of students in the fall of 1999, showed a gain of 0.88 ± 0.21 SD. The combined data for the three classes show a gain in learning of 0.75 ± 0.25 SD. The class has been taught three times a year for several years, with enrollments ranging from 6 to 55. Class averages were negatively correlated with class size (r = -0.53). In the classes reported here, the class averages were close to what would be expected from class size. The gain in learning (g) in basic physics for standard lecture courses is about 0.2 nationwide. For interactive lecture demonstrations, the gain in learning is about 0.35 (Cummings et al., 1999). These values are based on much larger sample sizes, and their standard deviations are much smaller than those I obtained this semester. I suggest that administering pre- and post-tests is a reasonable method for assessing the effectiveness of teaching in circumstances where controlled experiments are impractical. The value of g, while it may or may not be directly comparable between disciplines, is at least based on information that the instructor considers relevant to the goals of the course, unlike course polls, which arguably can be susceptible to other influences.
INTRODUCTION
For the last several years I have taught the Genetics and Evolution course at Rensselaer. As I indicated in an earlier paper (Roy, 1996), I was influenced by student polling results and by the availability of sophisticated simulation programs, and I adopted a studio style of instruction at about the same time this method was being developed in Physics and Mathematics at Rensselaer. This was rewarded by an increase in my course poll results with no detectable drop-off in student performance.
I have now taught the course three times a year for several years, so there is a fair amount of experience with it. Over that time I have added considerably to the resources available to the students. The most notable addition is the Visual Genetics software package, originally developed for Windows by Alan Day and Robert L. Dean of the University of Western Ontario. I expanded this to the Macintosh platform, and then to the Internet platforms of both Windows and Macs. In view of the decreasing numbers of Macintosh computers on campus, I have made a Macintosh emulator available to the students. This runs as a standalone program on Windows computers, so that our Mac-specific software can still be used when the Macintoshes themselves are finally retired, probably some time in the next two years.
Another large addition has been the placement of the course web site on the WebCT server. This has enabled the implementation of a number of functionalities, such as a bulletin board, glossary, text searching and indexing for the course notes and problem sets, audio recordings, and, most recently, automated testing. Much of this was added in a short time, but some of it has taken longer to develop.
As I reported earlier, I have found that the studio method can be implemented quite easily, and with good results, in a small studio class. Occasionally, however, in both small and large studio classes, I have noticed a drop-off in student polls, indicating some variability in student response to the way the class is actually taught in a given term. This has quickened my interest in seeking ways to improve what we are doing, and to measure that improvement with something more objective than a student poll.
Such measurements are difficult to make. One is dealing with an inherently variable population of students. One cannot be sure that only one variable is being changed when one tries to initiate an educational experiment. Often, one wishes to make more than one change, and even when one wishes to make no change, there is no guarantee that conditions will stay constant throughout the term. Finally, one has limited time for teaching, which prevents carrying out the kind of controlled parallel runs that one is used to seeing in experimental laboratory work.
In preparing for the Spring 1999 edition of this class, I was influenced by a few student comments and by a lecture by Professor Ronald Thornton of Tufts University on interactive lecture demonstrations. On reflection I realized that a number of the sessions I had taught during the Fall 1998 semester had been heavily dominated by lecturing on my part, despite the availability of many other modalities of instruction. I also observed that a great deal of each student's grade was based on a relatively small number of tests and projects. The projects involved doing laboratory simulations followed by written team reports. I decided to increase their number because too many students do not understand how to write such reports, so their first efforts were often poor. I also decided to use the automatic testing and grading mechanism of WebCT to generate more opportunities for the students to earn credit during the term. This utility allows one to ask a number of different types of questions (matching, multiple choice, exact calculation, essay).
Thornton's lecture was particularly influential because he showed how, even in a large lecture class, one could engage physics students by giving them something to do before a demonstration, and then asking them to evaluate what actually occurred in the demonstration. Thornton showed videotapes of students reacting to this instructional mode, and the evidence of involvement on the part of the students was impressive. When a question was asked, the students put their heads together and discussed it, with the noise level in the room rising steadily, until the point was elucidated and the students went on to the next thing. Cummings et al. (1999) reported results from a standard measure of student performance on basic questions in physics, administered before and after courses taught in three different ways: standard lecture (at many universities), studio physics, and interactive lecture demonstrations. The standard lecture course produced a fractional gain of only 0.2 (Hake, 1998). Surprisingly, studio physics also produced this result, but interactive lecture demonstrations did better (0.35). I decided to implement the interactive demonstration strategy within my studio class and deliberately eliminate lecturing as much as possible.
The above changes were not difficult for me to implement. I had already made a number of automatic tests using the WebCT quiz utility. Some of these were based on problem sets that I had kept on line for the students in the class for several years. I set the quizzes up so that the students could earn one course point for each automated quiz, for a total of 22 points. This is the equivalent of giving 22 homework assignments without having to grade any of them personally. I also automated the term exams, but with the difference that the students were told to write their answers on paper, register them on the computer, and then turn in the paper to me. I hand-graded only those questions that the computer judged the students had missed. This saved a lot of time, and I felt comfortable about issuing partial credit, since it meant overriding the computer in only a small number of instances. The students also benefited by getting their test results back much faster than usual, within two or three hours of the exam. To assist the students I posted practice exams. To replace lecturing, I made up a series of challenge problems to start off each class. These got the students working on problems right away, after which I often explained a few points informally or started them on a laboratory simulation or problem set. These exercises, created a few minutes before class, were very informal and were not put on line. Over time I expect to increase the number and focus of these exercises.
The pre-test and post-test were written before the course started and were deliberately made as close to each other in scope and difficulty as I could manage. These were also administered using WebCT, but without the students turning in any written material. A practice test was made available that allowed the students to replicate their experience with the pre-test. This may vitiate comparisons of this course with other courses where tests may have been treated more secretively, but part of my philosophy of education is to make clear to students what they are expected to do. Also, of course, no attempt was made to identify or create anything like a comprehensive model genetics exam. Genetics courses vary in length and content around the country, and there is no formal standard.
Because course polls were handled by a system new to our campus, with which I was unfamiliar, I did not investigate student reactions to all these changes in any formal way. Anecdotally, I can say that I had very few complaints about grading or exams during the semester or after the course was finished. I had reasonable but not perfect attendance, usually 8 to 10 students out of a class of 13. I found myself compelled to lecture to the students very little; I probably delivered less than 10% of the instructional material in lecture mode.
The outcome of these many changes (one could hardly call such a complicated array of changes an experiment) was that the students seemed to do quite well on my usual sort of test. The class average was 85 ± 10% SD. The fractional gain in knowledge (Hake, 1998) of basic genetic concepts, based on pre-test and post-test results, was 0.54 ± 0.32 SD. The mean compares favorably with the values reported by Cummings et al. (1999) for both studio physics (0.18) and interactive lecture demonstration physics (0.35). However, as the large standard deviation suggests, it is based on a very small sample and remains to be confirmed by more instances. It should be noted that the gain for standard lecture courses in physics, nationwide, is also around 0.2. This abysmal result is an indictment of the standard lecture format in physics. I do not know of any comparable data on the teaching of genetics.
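The fractional gain described above can be illustrated with a minimal sketch. It assumes that g is computed for each student from percentage scores and then summarized as a mean and sample standard deviation; the paper does not state whether gains were averaged per student or computed from class means, so that choice is an assumption. The function name and the score pairs are hypothetical and are not data from this course.

```python
from statistics import mean, stdev


def normalized_gain(pretest, posttest):
    """Fractional gain g = (posttest - pretest) / (100 - pretest), scores in percent."""
    if pretest >= 100:
        raise ValueError("gain is undefined when the pre-test score is already 100")
    return (posttest - pretest) / (100 - pretest)


# Hypothetical (pretest, posttest) pairs for illustration only, NOT course data.
scores = [(20, 60), (50, 75), (40, 85), (10, 55)]

# Assumption: one gain per student, summarized as mean and sample SD.
gains = [normalized_gain(pre, post) for pre, post in scores]
print(f"g = {mean(gains):.2f} +- {stdev(gains):.2f} SD (n = {len(gains)})")
```

With only a handful of students, a single unusual score pair moves the standard deviation considerably, which is consistent with the caution expressed above about small samples.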
The class average was 85%. One can judge how typical a result this was by comparing it with those obtained in recent editions of the class. The data are shown below in Table I. Class averages ranged from 75 to 93, so the number we got this time was on the high end of this range. Interestingly, when class average was plotted against class size, I observed a negative correlation of -0.53 (Figure 1). The summer editions of the course are populated by a highly select group of pre-medical students. However, when these were removed from the data set, the correlation was still reasonably strong: -0.49 (not shown). So, within the studio method of instruction (loosely defined), there appears to be a decrease in performance level with increased class size. The score of 85 obtained in this class is close to the straight line one would draw through the points in Figure 1. So, even correcting for the class size effect, the results appear quite normal for this course. This suggests that whatever changes I implemented did not affect student performance appreciably.
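The correlation and straight-line comparison described above can be sketched as follows. This is an illustrative calculation, not the author's original analysis: it assumes an ordinary Pearson correlation and a least-squares line, uses the class sizes and averages as transcribed from Table I, and the function name is hypothetical. Because the tabulated values are rounded, the coefficient it prints may differ slightly from the reported -0.53.

```python
from math import sqrt

# (class size, class average) pairs transcribed from Table I.
table_i = [
    (17, 88.1), (13, 85.4), (50, 76.2), (16, 90.29), (19, 74.61),
    (55, 82.2), (22, 90.51), (6, 93.4), (25, 80.68), (26, 88.9),
]


def pearson_and_line(points):
    """Return (r, slope, intercept) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # least-squares slope of average on class size
    return sxy / sqrt(sxx * syy), slope, mean_y - slope * mean_x


r, slope, intercept = pearson_and_line(table_i)
predicted = intercept + slope * 13  # fitted average for a class of 13 students
print(f"r = {r:.2f}, fitted average for 13 students = {predicted:.1f}")
```

Calling the same function on only the non-summer rows would correspond to the restricted analysis mentioned above, under the same assumptions.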
Because of the low numbers, I decided to repeat the procedure with the next edition of the class, which was taught in the Summer of 1999. Most of the students in this class belong to a highly select group, admitted provisionally to Albany Medical College and to Rensselaer on the basis of their high school records. I made few changes in the course for this session. I increased the number of quizzes by one, and I made it possible for the students to take each quiz twice and receive the average of the two grades. This would tend to push up the class average a little, but it was not a major change likely to affect testing. The pre-test and post-test were very similar but not identical to those used for the Spring course. I did allow students to turn in written material on the post-test, but the points added by hand were not significant. The class average was about 89, which falls in a cluster of points on the plot (Figure 1; data in Table I), consistent with the small class size. The gain in learning was 0.87 ± 0.09 SD.
Although there may be some question about the validity of doing so, I pooled the data from the two runs of the class to calculate an overall gain, which was 0.71 ± 0.28 SD. I have now run the class one more time, and the results remain consistent: this last class showed a gain in learning of 0.88 ± 0.21 SD. Overall, the value is 0.75 ± 0.25 SD for the three classes.
My tentative conclusion is that a large amount of lecturing is dispensable. Possibly the other changes I made compensated for the reduction in lecturing. Formally, it is possible that the results would have been better if I had not made those changes. I think that the time the students spend in class working on problems and simulations is just as well used that way as in listening to me talk in a linear fashion. Also, I find the automatic grading function of WebCT a great improvement over hand-grading. It makes the administration and evaluation of pre- and post-tests easy and efficient. My experience with this was repeatable: when I did the same thing a second and third time, the students again showed a substantial gain in learning. I would think that measuring the gain in learning is a valuable addition to, or replacement for, the standard student-opinion polls that are currently used to evaluate teaching.
| Semester | Class Average | Number of Students |
| Summer 99 | 88.1 | 17 |
| Spring 99 | 85.4 | 13 |
| Fall 98 | 76.2 | 50 |
| Summer 98 | 90.29 | 16 |
| Spring 98 | 74.61 | 19 |
| Fall 97 | 82.2 | 55 |
| Summer 97 | 90.51 | 22 |
| Spring 97 | 93.4 | 6 |
| Fall 96 (1) | 80.68 | 25 |
| Fall 96 (2) | 88.9 | 26 |
Table I. Class averages and sizes for Genetics and Evolution, Fall 1996 to Summer 1999.
Figure 1. Correlation of Grades and Class Size for Genetics and Evolution Fall 1996 to Summer 1999.
Grades and class size are negatively correlated, with a correlation coefficient of -0.53. However, two of the small classes were populated by a highly select group of pre-medical students, taking the class together during the summer. When the summer students are taken out of the analysis, the correlation coefficient of class size vs. grades is -0.49.
References
Cummings, K., Marx, J., Thornton, R., and Kuhl, D. (1999) Evaluating innovation in studio physics. Am. J. Phys. 67 (Supplement 1 to No. 7), S38-S45.
Hake, R.R. (1998) Interactive-engagement versus traditional methods: a six-thousand-student survey of mechanics test data for introductory physics courses. Am. J. Phys. 66, 64-74.
Roy, Harry (1996) Teaching Studio Genetics and Evolution (http://www.rpi.edu/dept/bio/info/Biosimlab/genetics.html)