Research Article | Open Access
Using Precourse Formative Written Testing in a Pharmacology Class Greatly Increases Medical Students’ Performance in Final Written Summative Tests
We wanted to test the progress of medical students at our university in a pharmacology course. The formal teaching was given as lectures to the full class of students. We gave the very same written test of multiple-choice (MC) questions (single best choice) to third-year medical students before and after a one-semester course of basic pharmacology. The initial voluntary test (containing 30 MC questions) was taken by 79% of the eligible students (n = 147) a week before the pharmacology lectures started. With a passing grade defined as 60% of answers correct, only 2% of the students passed the test. Scores ranged from 5 to 21 points. The final, now obligatory, written test at the end of the course (one week after the last lecture in pharmacology) was taken by all students in the semester (n = 179) and was passed by 95% of students, using the same passing score. Here, scores ranged from 12 to 29 points. Over the course of the semester, attendance at the lectures dropped dramatically to less than 10% of the students. Hence, progress tests are useful, but they hardly measure the gain in knowledge through attendance at the pharmacology lectures (the intervention) alone; they also measure other sources of knowledge, such as textbook reading or memorizing the initial questions and looking up the answers.
Assessing a gain in knowledge in education is a continuous task. One possibility lies in using various forms of progress tests. This has been done in various ways in medical education in numerous medical schools and countries worldwide. For that purpose, multiple-choice (MC) tests rather than oral tests are usually given. The advantages of MC tests compared to oral examinations are that they can be made highly reliable and objective, and they can be standardized to test large classes in a short time and to test broad knowledge. Finally, they are cost-effective because a computer can grade and evaluate the tests, providing information such as difficulty indices, discrimination indices, reliabilities, strength of distractors, test discrimination, and other psychometric parameters [2, 3]. Usually, only one MC test is given at the end of a semester for a certain subject, for instance, after the students, if they so chose, could have attended a lecture series in basic pharmacology. From the scores of this final MC test, it is not necessarily obvious how much of the measured knowledge results from the lectures and how much from previous knowledge. We therefore wanted to know how much prior knowledge (at the beginning of the semester) or work outside lecture attendance accounted for success in the end-of-semester examination. We hypothesized that the increase in knowledge was solely due to the lectures during the semester.
2. Related Studies
Others tried to assess gain of knowledge in medical education by giving identical tests before, during, or after completion of a curriculum [4, 5]. In other contexts, repeated testing without studying was more useful for knowledge retention than studying without testing. Testing prior to learning may have advantages, such as motivating students to prepare before attending lectures or courses, and might make students aware of their specific knowledge gaps. Even unsuccessful retrieval of knowledge in pretesting may facilitate subsequent learning from lectures [8, 11].
In order to exclude the possibility that students simply learned the right answers from the pretest by heart and thus passed the end-of-semester examination, we did not give out the questions of the pretest to students. Moreover, we left students unaware that the second test would contain the same questions as the pretest. In addition, we included a control group of students who were given the same questions in the final exam but had no opportunity to see them in a pretest.
Our research hypothesis was that giving the same MC test twice (a pretest before the teaching period and a final test after the teaching period) would be a proper way to assess the success of pharmacology lectures given to medical students, at least for those students who took part in the lectures.
3.1. Research Methods
An initial voluntary written test (pretest) of academic achievement in (in this case general) pharmacology contained 30 multiple-choice (MC) questions with a passing grade of 60% (see Figure 1). The MC questions reflected the learning objectives of the subsequent lectures. Typically, two questions were constructed for each lecture. The content encompassed basic and systematic pharmacology, such as pharmacodynamics, pharmacokinetics, autonomic pharmacology, antiarrhythmic drugs, drugs that lower blood pressure, and antibiotics. To motivate students to take the initial test, we offered merit-based bonus points toward the final examination, which could improve their grades. In past years, we gave pretests (with different questions than in the present study) without offering incentives like bonus points, but participation rates were low (10–20% of eligible students) and performance was poor: students later told us they had not taken the exam seriously and had answered many questions randomly.
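The 60% passing rule translates into a fixed point threshold. As a minimal sketch (the function names `passing_points` and `pass_rate_percent` are illustrative assumptions and not part of the study's tooling), the threshold for a 30-question test and the share of passing students in a list of scores could be computed as follows:

```python
def passing_points(n_questions: int, pass_percent: int) -> int:
    """Smallest whole number of correct answers that reaches pass_percent."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (n_questions * pass_percent + 99) // 100


def pass_rate_percent(scores: list[int], threshold: int) -> float:
    """Percentage of scores at or above the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for s in scores if s >= threshold)
    return 100.0 * passed / len(scores)


threshold = passing_points(30, 60)  # 30 MC questions, 60% passing grade
# Hypothetical scores for illustration only, not study data.
example_scores = [5, 11, 12, 18, 21]
print(threshold, pass_rate_percent(example_scores, threshold))
```

For 30 questions and a 60% passing grade, the threshold evaluates to 18 points, the value used throughout this study.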
The summative test at the end of the course contained the same MC questions as the pretest (Figure 1). Students sat in a lecture hall, separated from each other by empty seats, and supervised by teachers. Four versions of the written tests were prepared that differed only in the order of the questions and answers.
Students who previously took the same required test after a pharmacology course, but without a pretest, served as a control group (Figure 1). In cohort 1 (control group), the baseline consisted of 219 students. In cohort 2 (study group), 147 students took the voluntary pretest, and 142 of them also took the required final test. Cohort 3 comprised 37 students who would have been eligible to take the pretest but chose to sit for only the final test, so that 179 students in total took the required final test. Consequently, only five students participated in the pretest but did not take the final exam.
The attendance of students in lectures (one lecture per week with a total of 11 lectures) in both groups of students was monitored: students were asked to fill in a paper attendance sheet.
3.2. Data Analysis
Arithmetic mean values and standard errors of the mean (SEM) were calculated using Excel 2010. Correlations (Spearman correlation) and parametric or nonparametric tests were computed using SPSS 25. A probability value (p value) of less than 0.05 was regarded as significant.
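For readers who wish to reproduce these descriptive statistics without Excel or SPSS, the following is a minimal sketch using only the Python standard library. It is not the authors' workflow; the function names and the example data are assumptions made for illustration. The SEM is the sample standard deviation divided by the square root of the sample size, and the Spearman coefficient is the Pearson correlation of the rank-transformed values, with ties receiving averaged ranks.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data for illustration only: lectures attended and exam points.
attendance = [0, 2, 5, 8, 11]
points = [22, 27, 19, 25, 24]
print(round(sem(points), 2), round(spearman(attendance, points), 2))
```

The sketch returns only the coefficient; a significance test, as reported by SPSS, would additionally require a reference distribution. It also assumes that neither variable is constant, since the denominator would otherwise be zero.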
3.3. Data Availability
Interested readers can obtain all data (original data and statistical analysis) in electronic format from any of the authors.
3.4. Evaluation Results
In order to establish a baseline distribution of test results after the basic pharmacology course (Figure 1, upper lane), a written, obligatory examination was given to cohort 1 of medical students. In cohort 1 (control group), 219 of 227 eligible students sat for the exam (96% of students). The mean score (arithmetic mean ± SEM) was 20.29 ± 0.27 points. Of the participants, 70 were male and 149 were female; both genders reached similar scores, namely, 19.9 ± 0.46 and 20.4 ± 0.42 points, respectively. In this control group, there was no correlation (according to Spearman) between attendance at the lectures and the points achieved in the final examination (Figure 2).
In the voluntary pretest (Figure 3), given to the subsequent group of new students (cohort 2 pretest), 147 of 184 eligible students participated (80% of students in this cohort). The mean score was 11.5 ± 0.23 points. Of the participants, 65 were male and 82 were female; they again reached similar scores, namely, 11.9 ± 0.38 and 11.1 ± 0.28 points, respectively. The distribution of points is depicted in Figure 3. Taking 60% as a passing grade (18 points), only three of the 147 students who sat for the examination would have passed (2%). Scores ranged from 5 to 21 points; the lowest score of 5 points was obtained by a student who subsequently did not take the final exam. This indicates practically no knowledge of basic pharmacology in these students, which was to be expected, as they were exposed to the lectures in basic pharmacology only in the following weeks. The same exam was given again to these students (cohort 2 pretest + obligatory test), and now the students reached 25.8 ± 0.31 points (Figure 4). Using Student’s t-test, the mean test scores were higher in the test after the lectures (final exam) than in the entrance exam (Figure 3 versus Figure 4). Of all participants in cohort 2 (pretest + obligatory test), 62 were male and 80 were female, and they again reached comparable results (25.4 ± 0.49 and 26.2 ± 0.41 points, respectively). Taking 60% as a passing grade (18 points), as many as 136 of 142 participants would have passed (95%). The lowest score was 12 points (one student), and the highest score was 29 points (21 students). In this study cohort too, there was no significant correlation (according to Spearman) between attendance at the lectures and the points achieved in the final examination (Figure 4). The mean points reached in Figure 4 were higher than in Figure 2 (Mann–Whitney test). Moreover, the points reached in Figure 4 were higher than in Figure 3 (Mann–Whitney test).
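The Mann–Whitney comparisons mentioned above rest on the U statistic, which counts how often a score from one group exceeds a score from the other. The following is a minimal sketch of that statistic only; the function name `mann_whitney_u` and the example scores are assumptions for illustration, and the sketch does not reproduce the p values that a statistics package such as SPSS reports.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y), the Mann-Whitney U statistics for two samples.

    U_x counts the pairs (a, b), a from x and b from y, in which a exceeds b;
    tied pairs count one half.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two statistics always sum to n_x * n_y
    return u_x, u_y


# Hypothetical scores for illustration only, not study data.
control = [18, 20, 21, 22]
study = [24, 25, 26, 21]
u_control, u_study = mann_whitney_u(control, study)
print(u_control, u_study)
```

A small U for one group relative to the product of the two sample sizes means that its scores rarely exceed those of the other group. The pairwise loop is quadratic in the sample sizes, which is unproblematic for cohorts of a few hundred students.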
Students who had not participated in the pretest (cohort 3, obligatory test only; 20% of those taking the obligatory test) reached a mean of 25.3 ± 0.55 points (Figure 5). Of these, 10 were male and 27 were female, reaching similar scores of 23.5 ± 1.39 and 26.04 ± 0.52 points, respectively. Taking 60% as the passing grade (18 points), as many as 36 of 37 participants would have passed (97%). The lowest score was 15 points (1 student), and the highest score was 29 points (3 students).
Interestingly, one student deteriorated from 13 to 12 points from the first examination (pretest) to the second (obligatory final test). In contrast, the largest improvement (one student) was from 6 to 29 points. Three students improved from 8 to 29 points, and one student exhibited the smallest improvement, from 7 to 13 points. The final, obligatory test at the end of the course lectures was taken by all eligible students (n = 179) and was passed by 94.97% (points obtained ranged from 12 to 29).
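The individual changes described above are simple paired differences (final test points minus pretest points). As a brief sketch, using only the four score pairs named in the text and a hypothetical helper `gains`, they can be tabulated as follows:

```python
# (pretest points, final test points) for the individual students named above.
paired_scores = [(13, 12), (6, 29), (8, 29), (7, 13)]


def gains(pairs):
    """Final-test points minus pretest points for each student."""
    return [post - pre for pre, post in pairs]


student_gains = gains(paired_scores)
print(student_gains)                            # point change per student
print(max(student_gains), min(student_gains))   # largest and smallest change
```

A negative value marks a student whose score fell between the two tests.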
The high pass rate in the final test might be interpreted as a gain of knowledge through the course, but (judging from informal talks with students) it might also be due to students memorizing the questions (which, however, were never formally released).
The difference in final exam points between students who took the initial exam and those who did not is of interest. These groups are plotted separately in Figures 5 and 6. A one-sided t-test gave a p value of 0.032, indicating significance. Moreover, no gender differences were apparent, which is reassuring (data not shown). In Figure 7, the distributions of each group (cohorts 1, 2, and 3) are combined in order to facilitate comparison between groups.
In the obligatory exam in clinical pharmacology (which was taught in the sixth and seventh semesters to the same class of medical students, see Figure 1), given at the end of the seventh semester (213 participants = cohort 4), we had the chance to follow up the 147 students of cohort 2 (pretest + obligatory final exam) from the initial fifth semester. Besides the 147 students, 27 students (of originally 37 students) were included who had participated in only the written, obligatory test (final exam: Figure 6, cohort 3). In the subgroup of cohort 4 (147 cohort 2 students), the mean points obtained amounted to 16.02 ± 0.278. Male students and female students reached similar points, namely, 16.54 ± 0.426 and 16.99 ± 0.368 points, respectively. The range was between 7 and 25 points.
Taking 60% as the passing grade (18 points) in this obligatory exam at the end of the seventh semester, only 74 of 213 students would have passed (34.74%). Taking 60% as the passing grade in our subgroup of cohort 4 alone, just 47 (19 male and 28 female) of 147 students would have passed (31.97%). The following mean points were reached: 16.15 ± 0.287; of these students, 54 were male and 85 were female, reaching similar scores of 16.69 ± 0.425 and 17.01 ± 0.391 points, respectively. The range was between 8 and 24 points. As mentioned above, we were able to follow up in the seventh semester 27 of the 37 students who had taken only the obligatory test (final test: Figure 6, cohort 3) in the fifth semester. These 27 students reached a mean of 15.89 ± 0.820 points; 8 were male and 19 were female, reaching similar scores of 14.5 ± 1.647 and 15.89 ± 0.951 points, respectively. The range was between 7 and 25 points. In Figure 8, the sequence of the study steps is reproduced, and the percentages of students who passed and failed in the study arms (cohorts), together with the corresponding numbers of students, are given for each cohort. The pretest group (cohort 2) is highlighted by a grey background, and the obligatory exams are represented by dashed rectangles to facilitate the allocation of students to the study groups (Figure 8).
Moreover, we tried to correlate the results of the exams in basic pharmacology (fifth semester) with the results of the students’ final exam (the M2 exam: a Germany-wide, written MC board exam comprising all clinical medicine topics, including basic and clinical pharmacology). We obtained data from 114 students. Students took the M2 exam in April 2016, when up to 319 points could be obtained, or in October 2016, when up to 317 points were available. Among the 96 students who took the pretest and the final exam (scores ranged from 210 to 295 points), 41 male students obtained a mean of 257.17 ± 3.054 points and 55 female students obtained 256.29 ± 2.780 points. There was a significant correlation between the points in the final exam of the introductory pharmacology course and both the subsequent clinical pharmacology course (Spearman correlation) and the final state exam (called “M2-exam” in Germany; Spearman correlation).
Besides using MC tests for summative exams, many medical faculties also use MC questions for formative exams. Successful learning can be understood as observable changes in the learners’ behavior that originate from external conditions. Interestingly, retrieval of knowledge can affect later retention. Retention of knowledge is better if knowledge is tested at all, compared to groups who have not sat for any exam (testing effect: for review, see [13, 14]). For instance, in an eighth-grade science classroom in the USA, better scores were reached in the final exam of the course when the topics had been tested before: 92% of the previously quizzed MC questions were answered correctly, a higher proportion than for MC questions not previously tested. However, this study was not on medical students; the examination was online and thus might be subject to manipulation (students might have texted the right answers among themselves).
However, one might use tests to enhance retention of important clinical facts, in our context about clinically important drugs, e.g., their indications, contraindications, and relevant pharmacokinetic parameters. While the testing effect has been clearly demonstrated in artificial psychological laboratory settings, it is critical to know whether it is also present in a current medical curriculum, in this study in pharmacology for medical students. It has been argued that in real life, medical students also learn outside the classroom (e.g., during ward rounds and clerkships): they are exposed to pharmacological knowledge in other lectures and courses (internal medicine, dermatology, etc.), they do homework on their own or in groups, and they get reading assignments or at least suggested papers or textbook chapters in pharmacology.
A well-established way to assess progress in knowledge acquisition is to use progress tests (usually in electronic form [17, 18], like the multicentric tests in the Netherlands and in Germany). Some authors concluded that progress tests might be useful for early identification of students who may need special attention and might be a useful tool for self-learners [21, 22]. In these progress tests, in contrast to our study, typical final-year board examination questions are continuously given throughout the semesters during medical school. All specialties of clinical medicine are tested, a large pool of questions (question bank) is available, and no question is asked twice.
Others have given identical questions repeatedly to assess competency in clinical examination but not in pharmacology. Other colleagues tested 32 students twice a year with the same 47 MC questions to assess the maintenance of gains in learning in pharmacy students (but not medical students) on pharmacotherapeutics. Moreover, their main goal was to compare team-based learning versus lectures.
A study similar to ours but in a different environment was recently published by colleagues in Canada. They tested whether preceding (online) MC tests enhanced knowledge retention from subsequent workshops (the didactic intervention) for pediatricians. Their control groups did not receive a preceding MC test. After the workshops, both groups were given the same MC questions online. It turned out that retention (measured as performance in the MC test after the workshop) was better if a pretest had been done. This is an encouraging similarity to our results. However, they tested certified pediatricians; hence, some previous knowledge is to be expected (see Figure 2 of that study). Moreover, the more motivated pediatricians attended the workshops; hence, there might have been a selection bias among participants (only 186 of 308 participants, 62%, were willing to take part in the study). Therefore, their results are most likely better than would be obtained with less-motivated participants (which might include our students). In contrast to our study, only five MC tests were given in the pretest and five different MC tests were given in the knowledge test period.
It might be gratifying to note that students who had sat for the pretest performed better in the final test (Figure 4) than students who did not sit for a pretest (Figure 2). However, this interpretation is clearly not fully supported by the data, as students who did not sit for the pretest in the second semester nevertheless performed better (Figure 6) than students in the previous semester (Figure 2). Hence, one could simply conclude that students who prefer to study on their own do not gain much from a pretest (comparing Figures 4 and 6).
We would like to make the point that the present study, with quite a number of participants (147–219 students per semester), is at odds with other studies with lower numbers of participants, where identical tests were given twice and an improvement in mean points was regarded as proof of the efficacy of the teaching intervention. For example, clinical students in an intensive care rotation were given the same questions initially and four weeks later; the 32 participants showed an increase in exam points of 4.6 from a baseline of 65.7.
One can ask how we know that the control group was a valid control group and not simply a cohort of generally poorer performing students. One could argue that, without randomly assigning students to experimental and control groups, it would be necessary to confirm in some other fashion that the control group matches the experimental group on all relevant background variables. This is admittedly a limitation of our study. However, we noted that in the written test after the course in clinical pharmacology (end of seventh semester), the control group obtained mean scores that were not statistically different from those of the study cohort. This argues against the assumption that the control cohort was an academically weaker group of students than the study cohort. Furthermore, one can ask why lecture attendance was uncorrelated with final exam performance. This was admittedly surprising to us: we had anticipated a strong positive correlation. However, many colleagues in several countries have privately mentioned similar findings: attendance of medical students at lectures (which is not compulsory at most universities worldwide) declines sharply over time. Students usually explain this by competing demands on their time, such as studying for other forthcoming examinations.
Moreover, since there was very little difference in final summative test performance between those who took the pretest and those who did not, one can ask what the benefits of administering the pretest were. This clearly calls the usefulness of the pretest into question. One way to address this issue might be to assess in a subsequent study, by means of an additional questionnaire, whether or not students found the pretest subjectively helpful (for better understanding the lectures or the textbook, or for preparing for the subsequent test). If students reported a strong desire to retain the pretest, that should merit consideration, as student satisfaction plays a role in curriculum development in most faculties. Otherwise, we would not use a pretest again, as it ties up resources.
4.1. Future Work
In the future, to reduce demands on our resources, we intend to use the basic format of this study with online tests as pretests. It will be interesting to see whether this will lead to worse, similar, or better results in the final written exams than written pretests. Moreover, if one were to repeat the present investigation, it would be informative to find out which other sources of information students actually use under our testing conditions. One could offer an open questionnaire on learning tools and habits and correlate these learning habits with the final test results; the pretest results could then be used as a contributing factor to the final test results.
In summary, giving the same MC questions twice to test an intervention in between has probably overestimated the impact of the intervention on the gain of knowledge. To the best of our knowledge, this is the first study of this kind in medical students in pharmacology.
Progress tests, consisting of a pretest and a final test, are useful to measure gain in knowledge in medical students, but they hardly measure the gain in knowledge through attendance at, e.g., a basic pharmacology lecture (the intervention) alone; they also measure other sources of new knowledge, such as textbook reading or memorizing the initial questions.
All original data are available in electronic form.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
J. N. designed the research. S. S. and U. G. performed research. S. S., J. N., and U. G. analyzed data. U. G. and J. N. wrote the paper.
The authors acknowledge the support of PD Dr. Alp Aslan (Institute for Psychology, University Halle) with the design, statistical tests, and interpretation of the study. The authors thank the state board of medical examiners (Landesprüfungsamt Halle), especially Frau Roscher, for making data available to us. The authors acknowledge the financial support within the funding program Open Access Publishing by the German Research Foundation (DFG). The work did not receive any external funding. All internal funding was through the state-owned Martin Luther University Halle-Wittenberg.
1. A. Krouska, C. Troussas, M. Virvou, and C. K. Fragkakis, “Applying Skinnerian conditioning for shaping skill performance in online tutoring of programming languages,” in Proceedings of the 9th International Conference on Information, Intelligence, Systems and Applications (IISA), pp. 1–5, Zakynthos, Greece, July 2018.
2. D. Bauer, M. Holzer, V. Kopp, and M. R. Fischer, “Pick-N multiple choice-exams: a comparison of scoring algorithms,” Advances in Health Sciences Education, vol. 16, no. 2, pp. 211–221, 2011.
3. L. W. T. Schuwirth, D. E. Blackmore, E. Mom, F. Van Den Wildenberg, H. E. J. H. Stoffers, and C. P. M. van der Vleuten, “How to write short cases for assessing problem-solving skills,” Medical Teacher, vol. 21, no. 2, pp. 144–150, 1999.
4. B. Bleske, T. Remington, T. Wells, K. Klein, J. Tingen, and M. Dorsch, “A randomized crossover comparison between team-based learning and lecture format on long-term learning outcomes,” Pharmacy, vol. 6, no. 3, p. 81, 2018.
5. R. G. Williams, D. Klamen, T. Clark, S. T. Hingle, G. M. Rull, and J. Daniels, “Physical findings progress test at a medical school: longitudinal data analysis,” in Proceedings of the Association for Medical Education in Europe, vol. 177, p. 4J1, Basel, Switzerland, 2018.
6. D. P. Larsen, A. C. Butler, and H. L. Roediger III, “Repeated testing improves long-term retention relative to repeated study: a randomised controlled trial,” Medical Education, vol. 43, no. 12, pp. 1174–1181, 2009.
7. M. Feldman, O. Fernando, M. Wan, M. A. Martimianakis, and K. Kulasegaram, “Testing test-enhanced continuing medical education,” Academic Medicine, vol. 93, no. 11S, pp. S30–S36, 2018.
8. L. E. Grzeskowiak, A. E. Thomas, J. To, E. Reeve, and A. J. Phillips, “Enhancing continuing education activities using audience response systems: a single-blind controlled trial,” Journal of Continuing Education in the Health Professions, vol. 35, no. 1, pp. 38–45, 2015.
9. L. E. Richland, N. Kornell, and L. S. Kao, “The pretesting effect: do unsuccessful retrieval attempts enhance learning?” Journal of Experimental Psychology: Applied, vol. 15, no. 3, pp. 243–257, 2009.
10. A. Melzer, U. Gergs, J. Lukas, and J. Neumann, “Rating scale measures in multiple-choice exams: pilot studies in pharmacology,” Education Research International, vol. 2018, Article ID 8615746, 12 pages, 2018.
11. A. Field, Discovering Statistics Using IBM SPSS Statistics, SAGE edge, London, UK, 2018.
12. A. Krouska, C. Troussas, and M. Virvou, “Computerized adaptive assessment using accumulative learning activities based on revised Bloom’s taxonomy,” in Knowledge-Based Software Engineering: 2018. JCKBSE 2018. Smart Innovation, Systems and Technologies, M. Virvou, F. Kumeno, and K. Oikonomou, Eds., vol. 108, pp. 250–258, Springer, Cham, Switzerland, 2019.
13. H. L. Roediger and J. D. Karpicke, “The power of testing memory: basic research and implications for educational practice,” Perspectives on Psychological Science, vol. 1, no. 3, pp. 181–210, 2006.
14. H. L. Roediger and J. D. Karpicke, “Test-enhanced learning,” Psychological Science, vol. 17, no. 3, pp. 249–255, 2006.
15. M. A. McDaniel, K. M. Wildman, and J. L. Anderson, “Using quizzes to enhance summative-assessment performance in a web-based class: an experimental study,” Journal of Applied Research in Memory and Cognition, vol. 1, no. 1, pp. 18–26, 2012.
16. K. B. McDermott, P. K. Agarwal, L. D’Antonio, H. L. Roediger, and M. A. McDaniel, “Both multiple-choice and short-answer quizzes enhance later exam performance in middle and high school classes,” Journal of Experimental Psychology: Applied, vol. 20, no. 1, pp. 3–21, 2014.
17. C. Bremers, A. Krouska, and M. Virvou, “Using a multi module model for learning analytics to predict learners’ cognitive states and provide tailored learning pathways and assessment,” in Machine Learning Paradigms. Intelligent Systems Reference Library, M. Virvou, E. Alepis, G. Tsihrintzis, and L. Jain, Eds., vol. 158, pp. 9–22, Springer, Cham, Switzerland, 2019.
18. C. Troussas, A. Krouska, and M. Virvou, “MACE: mobile artificial conversational entity for adapting domain knowledge and generating personalized advice,” International Journal on Artificial Intelligence Tools, vol. 28, no. 04, Article ID 1940005, 2019.
19. R. A. Tio, B. Schutte, A. A. Meiboom et al., “The progress test of medicine: the Dutch experience,” Perspectives on Medical Education, vol. 5, no. 1, pp. 51–55, 2016.
20. J. Arias, H. Schenkat, S. Finsterer, and M. Simon, “Students’ mentoring based on a structured selection using combined summative course and formative progress test results–a longitudinal view of students’ performance,” in Proceedings of the Association for Medical Education in Europe, Helsinki, Finland, 2017.
21. R. Gagnon and C. Bourdy, “A progress test to identify medical students with potential learning difficulties and to predict scores on the Canadian certification exam,” in Proceedings of the Association for Medical Education in Europe 2016, vol. 262, Barcelona, Spain, 2016.
22. A. Krouska, C. Troussas, and M. Virvou, “A literature review of social networking-based learning systems using a novel ISO-based framework,” Intelligent Decision Technologies, vol. 13, no. 1, pp. 23–39, 2019.
23. D. Piquette, R. Brydges, A. Goffi, C. Lee, B. Mema, and C. Walsh, “Assessing competency of subspecialty residents in critical care clinical reasoning: validity evidence in support of the script concordance test,” in Proceedings of the Association for Medical Education in Europe, vol. 177, p. 3I7, Basel, Switzerland, 2018.
Copyright © 2020 Joachim Neumann et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.