
Making grading in university courses more reliable
Inconsistent or inaccurate grading can have serious real-world consequences for students. Paige Tsai and Danny Oppenheimer offer tips on how to recognise and fix the problem

If you’d like to grade exams for a major testing corporation, it takes a lot of work. Prospective graders for the Educational Testing Service, for example, undergo training and at least one content certification test. Then prospective graders grade several practice tests that have already been scored by established graders.
If, on the other hand, you’d like to grade exams for an undergraduate class, the standards are much lower. Often grading is left to graduate students or high-achieving undergraduates with little more than basic grading guidelines. In fact, aside from a subset of education researchers and psychologists, most faculty lack expertise in psychometrics, effective rubric design or assessment best practice. The reality is that there is very little formal training for faculty – let alone for teaching assistants – on how to create effective assessment instruments.
- THE Campus webinar: what’s the future of higher education assessment?
- How to plan an online learning-friendly assessment
- How to make sure assessment practices are as authentic as possible
As a result, while we would like university grades to be reliable and valid indicators of student achievement, in practice, grades often contain a considerable amount of noise, especially in more subjective fields.
Wisdom of crowds v experts
While the issue of noise in grades could be addressed through rigorous training and calibration methods, a more resource-efficient approach has been identified by massive open online course (Mooc) providers looking for a scalable way to assess thousands (or tens of thousands) of students. Some Mooc providers ask students to grade one another’s work, then rely on the wisdom of crowds to assign a grade. According to wisdom-of-crowds research, the collective judgement of multiple uninformed individuals can be as accurate as, or even more accurate than, that of a single expert. Some people guess too high, others guess too low, and the noise cancels out, leaving only signal.
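To make that noise-cancelling intuition concrete, here is a minimal simulation sketch. It is not drawn from our study; the "true" grade, the size of graders' errors and the number of peers are illustrative assumptions, with four peers chosen to mirror the averaging strategy described later in this piece.

```python
import random

random.seed(1)

TRUE_GRADE = 6      # hypothetical "true" score on a nine-point scale
NOISE_SD = 1.5      # assumed spread of an individual grader's error
N_PEERS = 4         # peer graders averaged per essay (illustrative)
N_ESSAYS = 10_000   # simulated essays

def noisy_grade():
    """One grader's score: the true grade plus random error, clipped to 1-9."""
    return min(9, max(1, random.gauss(TRUE_GRADE, NOISE_SD)))

single_error = 0.0
crowd_error = 0.0
for _ in range(N_ESSAYS):
    peer_scores = [noisy_grade() for _ in range(N_PEERS)]
    crowd_mean = sum(peer_scores) / N_PEERS
    single_error += abs(peer_scores[0] - TRUE_GRADE)   # error of one grader alone
    crowd_error += abs(crowd_mean - TRUE_GRADE)        # error of the averaged crowd

print(f"average error, single grader: {single_error / N_ESSAYS:.2f}")
print(f"average error, mean of {N_PEERS} graders: {crowd_error / N_ESSAYS:.2f}")
```

When the errors are independent, averaging four graders roughly halves the expected error of any one grader (it shrinks roughly with the square root of the number of graders), which is the “noise cancels out” argument in miniature.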
But how well does it work for grading?
To investigate this question, we asked graduate students to grade essays written for a Mooc in their field of study and compared their grades with a wisdom-of-crowds grading strategy (averaging the grades assigned by at least four of each student’s peers). The results were both promising and troubling: we were encouraged to find that the crowd’s averaged scores were, on the whole, close to the experts’.
However, as we dug into the data, we discovered that the reason the scores were so similar was that the experts’ scores were, in many cases, as inconsistent as the crowd’s. In fact, pairs of experts agreed on essay scores in only 20 per cent of cases. For nearly 30 per cent of the essays, the scores differed by three or more points on a nine-point scale – the difference between receiving a B+ on an exam and failing! In addition, in several instances the same expert read an essay twice, or three times, and awarded it different scores. Perhaps most troubling was that the factors we thought should most strongly predict grades (such as the accuracy of the content in the essays) had very little influence on final scores, leaving us unsure of what the experts were basing their scores on.
If the grades awarded to students in university classes differ dramatically depending on the grader, are they valid indicators of student achievement? As a feedback tool, grades are useful only insofar as they are accurate. More concerning, however, is that an erroneously low grade could lead a student to abandon a course and/or hurt their chances of getting a job or being accepted into graduate school.
Resources for improving assessment practices
So how do we fix the problem? While best practices in assessment and psychometrics do exist, few faculty are aware of them or knowledgeable enough to implement them. First, we should publicise the resources that help faculty adopt better rubrics and assessment practices. Indeed, there are dozens of resources dedicated to improving assessment, and most university teaching centres have specialists who are available to consult with faculty. Often the more proximate issue is making faculty aware that they need these resources in the first place.
Second, it’s important to recognise that our judgement can be affected by seemingly irrelevant factors such as the weather, our hunger levels or even the time of day. We can improve consistency and accuracy by engaging in calibration exercises: having multiple graders read and score the same small sample of exams each day before grading begins. This can ensure that the graders are using the same standards and are aligned with one another. It can also help identify individual graders who are deviating dramatically from their peers and/or baseline standards (and so may need additional training).
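As a rough sketch of what such a calibration check could look like in practice – the grader names, sample scores and tolerance threshold below are purely illustrative assumptions, not data from our study – each grader’s scores on the shared sample can be compared with the group consensus and outliers flagged:

```python
# Hypothetical calibration-exercise data: each grader scores the same
# five sample exams on a nine-point scale before live grading begins.
scores = {
    "grader_a": [7, 5, 8, 4, 6],
    "grader_b": [7, 5, 8, 5, 6],
    "grader_c": [9, 8, 9, 7, 9],   # consistently higher than the others
}

FLAG_THRESHOLD = 1.0  # assumed tolerance, in points, for mean deviation

n_exams = len(next(iter(scores.values())))
# Group consensus: the mean score per sample exam across all graders.
consensus = [
    sum(s[i] for s in scores.values()) / len(scores)
    for i in range(n_exams)
]

for grader, s in scores.items():
    mean_dev = sum(s[i] - consensus[i] for i in range(n_exams)) / n_exams
    status = "review before grading" if abs(mean_dev) > FLAG_THRESHOLD else "aligned"
    print(f"{grader}: mean deviation {mean_dev:+.2f} points ({status})")
```

A check like this catches graders who are systematically harsh or lenient; tracking the spread of each grader’s deviations as well would also catch graders who are inconsistent without being biased.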
Finally, while we find that wisdom of crowds among novices does not fully eliminate the problem of noise in grading, a large body of research has demonstrated that averaging the scores of two (or more) independent evaluations is better than doing nothing. Indeed, even if there is only one grader available, having that grader give multiple independent estimates and averaging them (a process sometimes called dialectical bootstrapping) can lead to improvements.
Regardless of how it is done, universities need to attend more to the noise problem in grading. This will help us ensure that students are getting the accurate feedback they need to learn and grow in the classroom. In addition, given that grades are such strong determinants of socio-economic outcomes, reducing the noise in grades can help us reduce the likelihood that we are further contributing to injustice in society.
Paige Tsai is a PhD student in technology and operations management at Harvard Business School. She is interested in the judgements and decisions made by people in organisations. Danny Oppenheimer is a professor jointly appointed in psychology and decision sciences at Carnegie Mellon University. He researches judgement, decision-making, metacognition, learning and causal reasoning, and applies his findings to domains such as charitable giving, consumer behaviour and how to trick students into buying him ice cream.