Principal Investigator:
Dr. Celine Latulipe, Professor, Department of Computer Science, University of Manitoba, Canada (celine.latulipe@umanitoba.ca)
Co-Investigators:
Dr. John Anvik, Associate Professor, Department of Mathematics and Computer Science, University of Lethbridge, Canada
Kevin Lin, Assistant Teaching Professor, Paul G. Allen School of Computer Science and Engineering, University of Washington, USA
Sabrin Nowrin, Doctoral Candidate, College of Computing and Informatics, University of North Carolina at Charlotte, USA
Dr. Brian P. Railing, Associate Teaching Professor, Computer Science Department, Carnegie Mellon University, USA
Dr. Scott J. Reckinger, Clinical Assistant Professor, Department of Computer Science, University of Illinois Chicago, USA
Dr. Armita Zarnegar, Lecturer, School of Science, Computing and Emerging Technologies, Swinburne University of Technology, Australia
This research aims to understand how classroom environments in Computer Science impact student performance on, and perception of, two-stage exams (TSEs). A two-stage exam is an exam format in which students first write an exam individually and then rewrite the same exam in collaboration with a group of peers. This study involved researchers from multiple institutions running two-stage exams in post-secondary Computer Science courses and collecting data that were then anonymized and merged into a single dataset stored at the University of Manitoba. By collecting data from a variety of post-secondary institutions, we aimed to identify correlations between course environment (measured using a published metric, the Collaborative Active Learning Inventory) and students' perceptions of and performance on TSEs.

Students in participating courses were given surveys upon completion of TSEs, which included questions about their group's dynamics and their perceptions of participating in a TSE, as well as limited demographic questions. Students also had the opportunity to allow their grades to be used for research analysis (their grade going into the exam, their individual exam grade, and their group exam grade). Instructors of participating courses provided information about their classroom environment and their TSE setup through a course information survey, and provided data about their experiences running and observing TSEs through a short post-exam reflection survey.

Researchers from other institutions were invited to join the study through the Association for Computing Machinery (ACM) Conference on Innovation and Technology in Computer Science Education (ITiCSE). The ITiCSE conference has a 'working group' track, and I submitted a working group proposal for this research, which was accepted. Researchers interested in participating in our working group applied through the conference website and then had to obtain ethics approval at their own institutions (using my protocol as a template) before sharing their data with us. This website provides details about ITiCSE working groups: https://iticse.acm.org/2025/call-for-working-groups/
Results
We conducted quantitative analysis of student perception and performance data and synthesized the results with qualitative analysis of student and instructor perceptions. Our mixed-methods approach enabled us to identify several important factors influencing student perceptions of and performance on TSEs in Computer Science classes, including the group formation method and the group size during the TSE. Student survey responses indicated that students generally had very positive perceptions of TSEs, and a majority of students felt better about their performance on the individual stage after completing the group stage. Some instructors experimented with offering a different exam during the group stage: students who completed the same exam during both stages of the TSE reported more positive perceptions of the experience than students in courses where the group-stage exam differed from the individual-stage exam. TSE grade data compared with students' grades going into the assessment show that while the raw group-vs-individual grade differentials are highest for low-performing students, the relative grade improvement is similar across the performance spectrum. Instructors were positive about the experience of running TSEs, and all reported intentions to continue using these assessments in at least some courses.
The full working group report is available here: https://dl.acm.org/doi/epdf/10.1145/3760545.3783964