Introduction
Veterinary surgical skills training has evolved dramatically over the last decade with the widespread adoption of simulation-based training. Simulation-based surgical training offers numerous advantages over training on either cadavers or live, anesthetized animals. Learning surgery on cadavers is suboptimal because of the expense and ethical issues surrounding obtaining, storing, and disposing of the cadavers, and because cadavers experience autolysis, rigor, and lack of perfusion.1,2 Learning surgery on live, anesthetized animals requires obtaining and caring for the animals, ensuring animal welfare while a novice surgeon learns a technique they have not previously performed, placing students in a high-stress environment for learning, and, in non-survival surgeries, requiring faculty and students to face the emotional and ethical struggle inherent in euthanizing an otherwise healthy animal. Simulation-based veterinary surgical training has proven superior to training on cadavers3,4 or live animals5 in several studies, while other studies have reported similar learning outcomes.6,7 Simulation-based training allows students to learn, practice, and be assessed on their surgical skills before entering the operating theater, which can replace the need for non-survival surgeries, reduce the number of animals needed for surgical training, and refine students’ skills prior to live animal surgery, fulfilling Russell and Burch’s three Rs of animal use.8

Surgical skills are learned through deliberate practice, defined as repetitive skills practice with rigorous assessment, meaningful and specific feedback, and a progressive increase in difficulty level, resulting in cumulative improvement in skills performance.9,10 Most veterinary schools begin teaching students to perform surgery on models as a means of facilitating repetitive practice, assessment, and feedback. Research supports creating a surgical skills curriculum consisting of multiple skills practice opportunities beginning early in the veterinary curriculum, with faculty providing supervision and feedback.11 Research in skills training has demonstrated that distributed, or spaced, instruction results in better retention of skills than massed instruction.12–15 However, the optimum distribution of surgical skills training sessions through a veterinary student’s education has not been established. This study sought to compare the short- and long-term learning outcomes between two methods of scheduling clinical skills laboratory sessions teaching the skills to perform an ovariohysterectomy (OVH): condensed into approximately weekly sessions versus distributed over approximately monthly sessions.
Methods
This study was reviewed and deemed exempt by the Institutional Review Board at Lincoln Memorial University (LMU, #829 V.0).
A convenience sample of fourth-semester (second-year) veterinary students (n = 57) was enrolled during the spring of 2021, and another cohort of fourth-semester students (n = 45) was enrolled in the spring of 2022. Prior to the study, all students had successfully completed three semesters of clinical skills training, which included surgical skills laboratory sessions in instrument handling, knot tying, and suturing. Each laboratory session was 2–3 hours in duration, for a total of 27 hours of laboratory instruction in surgical skills during the first 3 semesters (Appendix 1). All laboratory sessions were supervised by faculty at an approximately one-to-eight instructor-to-student ratio. Students’ skills had been assessed in the first 3 semesters using an in-lab assessment in each surgical skills laboratory session, and on three objective structured clinical examinations (OSCEs), which occurred at the end of semesters 1–3 and contained several surgical skills stations each semester. Students failing to meet minimum expectations (passing 70% of stations) on an OSCE were required to retake and pass a remediation OSCE to progress in the curriculum.
Once enrolled in the study, students participated in their fourth semester of surgical skills training, including learning aseptic technique and the skills to perform an OVH on the LMUterus OVH model (Figure 1). The model consisted of a wood and polyvinylchloride (PVC) base covered by a three-layer silicone and foam outer covering, with a reproductive tract made of long, thin balloons. The use of this model in training and assessing veterinary students was validated in a previous study, which demonstrated that students’ performance scores on the model were moderately positively correlated with their performance scores, and moderately negatively correlated with their surgical time, on their first live canine OVH.16 This model allowed students to repetitively practice OVH on a single model, which would not be possible on a live animal or cadaver. The model’s replacement reproductive tract costs less than one US dollar. Additionally, the models do not require as much storage space as a cadaver, can be stored on a shelf rather than requiring refrigerator or freezer space, and do not decay or undergo rigor. Students participated in laboratory sessions of 2–3 hours in duration during their fourth semester, for a total of 22 hours of laboratory instruction in surgical skills, including 10 hours spent on the OVH model specifically (Appendix 1). OVH model laboratory sessions required each student to gown and glove, drape their patient, and perform a complete OVH on the model while maintaining asepsis. Students worked in pairs and took turns being the primary surgeon and the assistant surgeon. Students were supervised by the same group of faculty at an approximately one-to-eight instructor-to-student ratio.
Students enrolled in the weekly instruction group during 2021 completed their four model OVH training sessions over an approximately 3-week period and completed their mock OVH assessment during the fourth week. Students enrolled in the 2022 monthly instruction group participated in four model OVH training sessions spread over approximately 13 weeks with the mock OVH assessment during the fourteenth week. Students were sent home with an OVH model and encouraged to practice their skills at home outside of scheduled laboratory time. At each of the OVH instruction labs, students electronically reported how many practice hours they spent on their model in between laboratory sessions using their assigned audience response devices (Turning Technologies, Youngstown, OH) so that this factor could be considered in data analysis. Students also reported the number of hours they practiced between their final laboratory session and their assessment. Missing practice hours were replaced with the class median.
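The median-imputation step described above can be sketched in a few lines of Python; the function name and the use of `None` to mark a missing report are illustrative assumptions, not the study’s actual data pipeline.

```python
from statistics import median

def impute_practice_hours(reported_hours):
    """Replace missing self-reported practice hours (None) with the class
    median of the hours that were reported, as described in the Methods.
    Hypothetical helper; the study's actual pipeline is not published."""
    observed = [h for h in reported_hours if h is not None]
    class_median = median(observed)
    return [class_median if h is None else h for h in reported_hours]

# Example: one student failed to report between-session practice hours.
hours = [2.0, None, 4.0, 6.0]
imputed = impute_practice_hours(hours)  # the None becomes the median, 4.0
```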
Each student’s skills were assessed twice on a mock OVH model, performed in front of a trained faculty rater. The short-term skills retention assessment took place in the spring, 1 week following the completion of their series of OVH training sessions. The long-term skills retention assessment took place in the fall, 5 months following the initial assessment. Students were required to perform the mock OVH while observing aseptic technique and were scored on a previously validated 22-item rubric16 that scores student performance on each step on a 0- to 3-point scale, where an excellent performance is awarded 3 points, a good performance 2 points, a borderline performance 1 point, and a poor performance 0 points (Appendix 2). The maximum possible score was 66 points. Raters had previously agreed to create pass/fail criteria rather than rely on a cut score, as some individual rubric items, if botched, were considered serious enough to serve as a remediation trigger alone. Students were required to remediate if they made an error deemed grievous enough to pose a significant risk of harm or surgical complication to a live patient (e.g., unsafe entry into the abdomen, two or more loose ligatures, inadequate body wall closure, one major but uncorrected breach of asepsis). Students were also required to remediate if they made several smaller errors that, taken together, demonstrated sloppy or insufficiently polished surgical technique. Students were required to pass the first mock OVH assessment, or a remediation assessment, to progress in their curriculum.
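The rubric totaling and remediation logic described above can be sketched as follows; note that the five-minor-error cutoff is an invented placeholder, since the paper describes the “several smaller errors” criterion only qualitatively.

```python
def score_mock_ovh(item_scores, grievous_errors):
    """Total a 22-item rubric scored 0-3 per item (max 66) and apply the
    pass/fail logic described in the text: any single grievous error
    triggers remediation outright, regardless of the total score.
    Illustrative sketch; the minor-error threshold below is assumed."""
    if len(item_scores) != 22 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected 22 items scored 0-3")
    total = sum(item_scores)
    minor_errors = sum(1 for s in item_scores if s <= 1)  # borderline or poor items
    needs_remediation = bool(grievous_errors) or minor_errors >= 5  # assumed cutoff
    return total, needs_remediation

# A flawless performance: 66/66, no remediation required.
total, remediate = score_mock_ovh([3] * 22, grievous_errors=[])
```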
All raters were LMU faculty who were either specialty-trained surgeons or general practice veterinarians with over 5 years of surgical experience. All raters had taught in the mock OVH teaching laboratory sessions. Raters had been performing the mock OVH assessment for several years and had met several times over the preceding years to discuss how to score student performances using the rubric. A previous study of the OVH model and rubric demonstrated that the same group of raters produced scores with good to excellent internal consistency (Cronbach’s alpha 0.83–0.95) and fair inter-rater reliability (intraclass correlation coefficient 0.43).16 Students were assessed by a single rater for each skills assessment. Raters used students’ audience response devices to report their scores to prevent errors in data collection and analysis.
Statistical Methods
Students’ total rubric scores on the short- and long-term retention assessments were visually assessed for normality using Q-Q normality plots and approximated a normal distribution. These scores were described using mean and standard deviation and were compared between groups using a two-way mixed ANOVA with one within-subjects factor (time of evaluation) and one between-groups factor (group). The sphericity assumption of the ANOVA was not met, so the Greenhouse-Geisser correction was used. Eta squared was used as a measure of effect size; eta squared values of .01 can be considered a small effect, .06 a medium effect, and .14 a large effect. The reliability of rubric scores was assessed using Cronbach’s alpha as a measure of internal consistency. Assessment passing rates were compared between groups using a Chi-square test. Students’ practice hours outside of laboratory sessions in the spring were visually assessed for normality using a Q-Q plot and found to be non-normal in distribution. Student practice hours were described using median and interquartile range and compared between groups using a Mann–Whitney U test. Significance was set at .05, and all analyses were run in the Statistical Package for the Social Sciences (SPSS) version 28 (IBM, Armonk, NY).
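Two of the statistics described above can be sketched in Python with NumPy and SciPy (the mixed ANOVA itself was run in SPSS and is not reproduced here); the score and hour values below are invented for illustration only.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def cronbach_alpha(score_matrix):
    """Cronbach's alpha for an (n_students x n_items) rubric score matrix,
    the internal-consistency measure reported for the OVH rubric."""
    x = np.asarray(score_matrix, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Practice hours were non-normal, so groups were compared with Mann-Whitney U.
weekly_hours = [1.0, 2.0, 2.5, 3.0, 4.0]   # invented data
monthly_hours = [1.5, 2.0, 3.0, 3.5, 5.0]  # invented data
u_stat, p_value = mannwhitneyu(weekly_hours, monthly_hours)
```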
Discussion
Worldwide, simulation-based training has become an important component of teaching veterinary students to perform surgery. Numerous studies have validated veterinary surgical skills simulators4,16,18–24 and evaluated how best to teach surgical skills using them,11,25–29 but the optimal arrangement of surgical skills training sessions within a veterinary curriculum remains ill-defined. The end goal of any surgical training program should be to produce practitioners who are competent long term; initial learning is worth little if it is not retained and applied to subsequent surgeries. Studies have demonstrated that skills requiring accuracy, and skills that students initially must work harder to learn, are most subject to decay over time.30,31 Surgical skills fall into this category and, as a result, require regular practice to maintain.26,32

At the short-term retention assessment 1 week after the final training session, students in both groups performed well, with 78%–91% of students receiving passing scores; this offers additional evidence supporting the use of the OVH model in teaching. The weekly instruction group, who practiced their skills in four supervised laboratory sessions at approximately weekly intervals, outperformed students in the monthly instruction group, who practiced the same skills in four supervised laboratory sessions at approximately monthly intervals. This finding is similar to that of Shea and Morgan, who demonstrated that motor skills retention was initially superior when subjects practiced a task in a massed fashion rather than distributed among other tasks within the same day.33 However, condensing training into very short, intense blocks has not always proven superior for short-term retention of skills. A study in human surgical education demonstrated equivalence between massed (within 1 day) and distributed (weekly) instruction groups at a short-term retention test.15 Overall, our findings suggest that if educators are seeking students’ peak surgical performance on a given day (i.e., maximum short-term retention), massed skills training sessions in the preceding week(s) may be better than skills training sessions distributed over the several months prior. For example, if students are preparing to perform their first live surgery, and maximum skill is desired for that surgery, our findings suggest scheduling several practice sessions at approximately weekly intervals leading up to it.
At the retention test 5 months after the final training session, students in the weekly instruction group experienced a significant drop in their skills, unlike the monthly instruction group, which retained its initial performance level. However, because the weekly group’s skills were initially higher than the monthly group’s, the two groups performed similarly at the retention test. This finding adds to those of Moulton et al., who studied surgical residents in human health care being taught microvascular anastomosis in sessions that were either massed within 1 day or delivered on a weekly basis.15 Moulton et al. reported that the weekly training group performed better than the group taught entirely within 1 day, both on the retention test 1 month following training and in task transfer to anastomosis in a live rat. Our findings, combined with those of Moulton et al., suggest that for long-term retention of surgical skills, weekly or monthly sessions are superior to multiple sessions scheduled within a single day.
Several educational theories explain why students taught in distributed instructional sessions should outperform those taught in massed sessions. Stimulus Sampling Theory, as proposed by Estes, suggests that spaced sessions are inherently more variable in stimulus or context, and that learning increases because of this variability between sessions.34,35 Hintzman proposed the inattention theory, which states that when the space between sessions is short, normal processing of information is attenuated.36 A cluster of study-phase retrieval theories suggests that the act of retrieving the memory of the first training session during the second training session results in more durable learning.37–39 Subsequently, Benjamin and Tullis added the reminding model, which posits that retrieving a memory after a higher degree of forgetting (i.e., after a longer interval) enhances the memory more than retrieving it after a lower degree of forgetting (i.e., after a shorter interval).40 They propose that if the space between sessions is too long, learners will forget too much and are unlikely to be adequately reminded of what they once knew; if the space is too short, learners have not forgotten enough, and the act of reminding will not be valuable. Additional research is necessary to quantify, for veterinary surgical skills, the ideal spacing of sessions that produces the optimal degree of forgetting; our research suggests that weekly and monthly sessions were similarly effective at imparting long-term retention of skills.
Students in the weekly and monthly instruction groups completed a similar number of practice hours outside of scheduled laboratory sessions. This suggests that regardless of the scheduling of laboratory sessions, students complete a similar amount of outside practice to feel they have reached the level of competence necessary to pass their assessment. In both groups, most of the practice hours were logged shortly before the assessment, as would be expected of students preparing for any examination. Veterinary students can be considered elite performers,41 and they will do what is necessary to reach competence prior to assessment, regardless of the scheduling structure of their training sessions. A total of 78%–91% of students (depending on group) passed their initial mock OVH assessment, indicating that most students will pass without the need for remediation or additional practice. However, after 5 months had elapsed, more students dropped below the passing threshold; at that time, only 70%–80% of students passed their assessment. This reiterates previous research demonstrating that students experience a significant loss of surgical skills when they have a break from training.26

Limitations and Future Research
This study utilized 2 years’ worth of veterinary students, enrolling one cohort in the weekly group and one cohort in the monthly group. While the curriculum was unchanged except for the scheduling of sessions, there is always a chance of minor differences in teaching between groups in two subsequent years. Performing a similar study that enrolls students in a crossover design would help to limit this potential source of error. This study enrolled only two groups, with laboratory sessions given approximately weekly and approximately monthly, respectively. It is possible that the ideal method of scheduling sessions is neither weekly nor monthly but something different altogether. Additional studies are necessary to further clarify the ideal distribution of simulation-based veterinary surgical training sessions. Students in this study were evaluated on only one surgical procedure, OVH on a single model; additional research would be necessary to clarify how our findings generalize to other surgical skills and procedures. Assessing global surgical competence would ideally include multiple assessments of each student performing numerous surgical procedures. Finally, inter-rater reliability for this group of raters on this assessment, as reported previously, was fair when assessed using the intraclass correlation coefficient.16 This indicates some degree of error resulting from raters’ judgments about student performances. Obtaining good or excellent inter-rater reliability can be challenging; devoting additional time to rater training may help accomplish this.