
Making grading in university courses more reliable
Inconsistent or inaccurate grading can have serious real-world consequences for students. Paige Tsai and Danny Oppenheimer offer tips on how to recognise and fix the problem

If you’d like to grade exams for a major testing corporation, it takes a lot of work. Prospective graders for the Educational Testing Service, for example, undergo training and at least one content certification test. Then prospective graders grade several practice tests that have already been scored by established graders.
If, on the other hand, you’d like to grade exams for an undergraduate class, the standards are much lower. Often grading is left to graduate students or high-achieving undergraduates with little more than basic grading guidelines. In fact, aside from a subset of education researchers and psychologists, most faculty lack expertise in psychometrics, effective rubric design or assessment best practice. The reality is that there is very little formal training for faculty – let alone for teaching assistants – on how to create effective assessment instruments.
- THE Campus webinar: what’s the future of higher education assessment?
- How to plan an online learning-friendly assessment
- How to make sure assessment practices are as authentic as possible
As a result, while we would like university grades to be reliable and valid indicators of student achievement, in practice, grades often contain a considerable amount of noise, especially in more subjective fields.
Wisdom of crowds v experts
While the issue of noise in grades could be addressed through rigorous training and calibration methods, a more resource-efficient approach has been identified by massive open online course (Mooc) providers looking for a scalable way to assess thousands (or tens of thousands) of students. Some Mooc providers ask students to grade one another’s work, then rely on the wisdom of crowds to assign a grade. According to wisdom-of-crowds research, the collective judgement of multiple uninformed individuals can be as accurate as, or even more accurate than, that of a single expert. Some people guess too high, others guess too low, and the noise cancels out, leaving only signal.
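To make that noise-cancelling intuition concrete, here is a minimal simulation sketch. It is not drawn from our study; the "true" grade, the size of graders' errors and the number of peers are illustrative assumptions, with four peers chosen to mirror the averaging strategy described later in this piece.

```python
import random

random.seed(1)

TRUE_GRADE = 6      # hypothetical "true" score on a nine-point scale
NOISE_SD = 1.5      # assumed spread of an individual grader's error
N_PEERS = 4         # peer graders averaged per essay (illustrative)
N_ESSAYS = 10_000   # simulated essays

def noisy_grade():
    """One grader's score: the true grade plus random error, clipped to 1-9."""
    return min(9, max(1, random.gauss(TRUE_GRADE, NOISE_SD)))

single_error = 0.0
crowd_error = 0.0
for _ in range(N_ESSAYS):
    peer_scores = [noisy_grade() for _ in range(N_PEERS)]
    crowd_mean = sum(peer_scores) / N_PEERS
    single_error += abs(peer_scores[0] - TRUE_GRADE)   # error of one grader alone
    crowd_error += abs(crowd_mean - TRUE_GRADE)        # error of the averaged crowd

print(f"average error, single grader: {single_error / N_ESSAYS:.2f}")
print(f"average error, mean of {N_PEERS} graders: {crowd_error / N_ESSAYS:.2f}")
```

When the errors are independent, averaging four graders roughly halves the expected error of any one grader (it shrinks roughly with the square root of the number of graders), which is the “noise cancels out” argument in miniature.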
But how well does it work for grading?
To investigate this question, we asked graduate students to grade essays written for a Mooc in their field of study and compared their grades with a wisdom-of-crowds grading strategy (averaging the grades assigned by at least four of each student’s peers). The results were both promising and troubling: we were encouraged to find that the crowd’s averaged scores were, on the whole, close to the experts’.
However, as we dug into the data, we discovered that the reason the scores were so similar was that the experts’ scores were, in many cases, as inconsistent as the crowd’s. In fact, pairs of experts agreed on essay scores in only 20 per cent of cases. For nearly 30 per cent of the essays, the scores differed by three or more points on a nine-point scale – the difference between receiving a B+ on an exam and failing! In addition, in several instances the same expert read an essay twice, or three times, and awarded it different scores. Perhaps most troubling was that the factors we thought should most strongly predict grades (such as the accuracy of the content in the essays) had very little influence on final scores, leaving us unsure of what the experts were basing their scores on.
If the grades awarded to students in university classes differ dramatically depending on the grader, are they valid indicators of student achievement? As a feedback tool, grades are useful only insofar as they are accurate. More concerning, however, is that an erroneously low grade could lead a student to abandon a course and/or hurt their chances of getting a job or being accepted into graduate school.
Resources for improving assessment practices
So how do we fix the problem? While best practices in assessment and psychometrics do exist, few faculty are aware of them or knowledgeable enough to implement them. First, we should publicise the resources that help faculty adopt better rubrics and assessment practices. Indeed, there are dozens of resources dedicated to improving assessment, and most university teaching centres have specialists who are available to consult with faculty. Often the more proximate issue is making faculty aware that they need these resources in the first place.
Second, it’s important to recognise that our judgement can be affected by seemingly irrelevant factors such as the weather, our hunger levels or even the time of day. We can improve consistency and accuracy by engaging in calibration exercises: having multiple graders read and score the same small sample of exams each day before grading begins. This can ensure that the graders are using the same standards and are aligned with one another. It can also help identify individual graders who are deviating dramatically from their peers and/or baseline standards (and so may need additional training).
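As a rough sketch of what such a calibration check could look like in practice – the grader names, sample scores and tolerance threshold below are purely illustrative assumptions, not data from our study – each grader’s scores on the shared sample can be compared with the group consensus and outliers flagged:

```python
# Hypothetical calibration-exercise data: each grader scores the same
# five sample exams on a nine-point scale before live grading begins.
scores = {
    "grader_a": [7, 5, 8, 4, 6],
    "grader_b": [7, 5, 8, 5, 6],
    "grader_c": [9, 8, 9, 7, 9],   # consistently higher than the others
}

FLAG_THRESHOLD = 1.0  # assumed tolerance, in points, for mean deviation

n_exams = len(next(iter(scores.values())))
# Group consensus: the mean score per sample exam across all graders.
consensus = [
    sum(s[i] for s in scores.values()) / len(scores)
    for i in range(n_exams)
]

for grader, s in scores.items():
    mean_dev = sum(s[i] - consensus[i] for i in range(n_exams)) / n_exams
    status = "review before grading" if abs(mean_dev) > FLAG_THRESHOLD else "aligned"
    print(f"{grader}: mean deviation {mean_dev:+.2f} points ({status})")
```

A check like this catches graders who are systematically harsh or lenient; tracking the spread of each grader’s deviations as well would also catch graders who are inconsistent without being biased.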
Finally, while we find that wisdom of crowds among novices does not fully eliminate the problem of noise in grading, a large body of research has demonstrated that averaging the scores of two (or more) independent evaluations is better than doing nothing. Indeed, even if there is only one grader available, having that grader give multiple independent estimates and averaging them (a process sometimes called dialectical bootstrapping) can lead to improvements.
Regardless of how it is done, universities need to attend more to the noise problem in grading. This will help us ensure that students are getting the accurate feedback they need to learn and grow in the classroom. In addition, given that grades are such strong determinants of socio-economic outcomes, reducing the noise in grades can help us reduce the likelihood that we are further contributing to injustice in society.
Paige Tsai is a PhD student in technology and operations management at Harvard Business School. She is interested in the judgements and decisions made by people in organisations. Danny Oppenheimer is a professor jointly appointed in psychology and decision sciences at Carnegie Mellon University. He researches judgement, decision-making, metacognition, learning and causal reasoning, and applies his findings to domains such as charitable giving, consumer behaviour and how to trick students into buying him ice cream.