The heart of the loop: Reattempts without penalty

This article originally appeared earlier this week at Grading for Growth, a blog about alternative grading practices that I co-author with my colleague David Clark. I post there every other Monday (David does the other Mondays). Check the end of this post for some extra thoughts that don't appear in the original.

This post is the final installment in a series on the Four Pillars of Alternative Grading. The first post focused on Clearly Defined Standards, the second one on Helpful Feedback, and the third one on Marks Indicate Progress. The final one is in some ways the most important, and the most controversial: Reattempts Without Penalty.

Every alternative grading scheme we have mentioned here on this blog has this idea in common: That students are allowed to reattempt work and resubmit it for feedback, and they incur no grade penalty for doing so. Grades are not primarily based on "one-and-done" assessments, and early missteps can be corrected and improved without cost to the student's grade.

You can see why this idea is so provocative. For students, it holds out the promise that there is no more final judgment based on single moments in time: that they will be allowed to improve and not just be expected to perform. For some instructors, it provides hope that student growth will (finally!) be the primary measure of success in a course, and some measure of grace and flexibility will be included along with high standards and "rigor". And for other instructors, this concept raises more questions than answers. Won't this just allow students to not take assessments seriously? Won't this cause a massive workload for me? And so on.

So let's unpack this idea, starting with what we know.

How it traditionally works

In traditional, points-based grading systems, the evidence that students present about their learning is almost always in the form of one-and-done assessments: Tests, exams, homework, presentations, and the like. Those assessments can take on various forms, and in well-constructed courses they do have varying forms, corresponding to different levels of Bloom's Taxonomy. But no matter the form, they are typically one-and-done: Students turn them in, and the work is graded, and that's that. Only rarely, and often only in certain disciplines like writing-intensive subjects, do you encounter the possibility of reattempts; and often those have penalties attached, by which we mean the reattempt does not earn the same credit as the original.

One-and-done assessment is clearly a terrible way to measure student learning. It captures a single moment in time, and this is taken to be representative. But if you were reading a research article and the author used a sample size of n = 1, how would you react? It makes sense as long as you don't think about it. But when you do start thinking about it, major issues arise. How do we know there were no confounding variables --- a.k.a. "life" --- contributing to a student's performance on a test? How do we know that Alice learns at the same speed as Bob, and that she wouldn't be better than Bob at the material if she had another week to work on it?

So the traditional one-and-done approach has many flaws. But there were smart and compassionate instructors who lived earlier than us, so how did this come to be the norm? I honestly don't know, but I suspect it's a combination of:

Convenience. One-and-done approaches are less work for everyone, especially the instructor. Nobody likes traditional grading because it is so soul-sucking and time-consuming, so why do it more often than necessary? This has a connection with the next point.
A misplaced trust in statistics. An argument for traditional grading goes like this: Sure, a single assessment might have a grade on it that doesn't accurately reflect student understanding. But factoring in all the assessments, the statistics done to compute the course grade will even out the noise; and in the end, the central tendency of student grades will be accurate. (Especially if "drop grades" are introduced to trim the outliers.) This is essentially an appeal to the law of large numbers (often loosely described as regression to the mean), and it is a useful concept in certain contexts. But it has a fundamental flaw when attached to grading: The marks we feed into the stats are not really measurements. I promised last week to expand on this idea in the future, and I will. For now, I will claim: Although we put numerical points on student work, these are not truly numerical data but rather ordinal categorical data (ordered labels, in other words). We shouldn't be performing statistics on these in the first place except for maybe modes, medians, and max values.
A fixation on rigor. The idea of letting students redo work strikes many people as soft. It leads to "grade inflation". It violates some kind of academic machismo code that views student assessment like competition in an arena. Now, we are all for high academic standards, and in fact alternative grading makes it possible to have higher standards than we've ever had, precisely because we don't give one-and-done assessments. But when academic standards and the related concept of "rigor" become the antagonist in your course and students the protagonists --- or maybe it's the other way around --- the environment becomes toxic, and the focus is moved from students and their growth to the abstracted concept of rigor, which we've written before is a meaningless term that should be abandoned, not celebrated.
Tradition itself. And of course, there's good old-fashioned inertia. Most of us instructors came up through courses that had one-and-done assessment, and it "worked for us", so we use it now. We never stop to think about the people for whom it didn't "work".
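To make the point about ordinal labels a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the four mark labels, the two students' marks, and both numeric codings are made up for illustration and do not come from any real course. It shows that the median, mode, and max need only the ordering of the labels, while a comparison of means can flip depending on which (equally order-preserving) numbers we happen to attach to those labels.

```python
from collections import Counter

# Hypothetical ordered labels, lowest to highest (an assumption for illustration).
LEVELS = ["N", "P", "M", "E"]
RANK = {label: i for i, label in enumerate(LEVELS)}


def ordinal_median(marks):
    """Lower median: uses only the ordering of the labels, never arithmetic."""
    ordered = sorted(marks, key=RANK.__getitem__)
    return ordered[(len(ordered) - 1) // 2]


def ordinal_mode(marks):
    """Most frequent label; ties go to the label encountered first."""
    return Counter(marks).most_common(1)[0][0]


def ordinal_max(marks):
    """Highest label earned, again using only the ordering."""
    return max(marks, key=RANK.__getitem__)


def mean_under_coding(marks, coding):
    """Average after converting labels to numbers with a chosen coding."""
    return sum(coding[m] for m in marks) / len(marks)


# Made-up marks for two students on five assessments.
alice = ["N", "N", "E", "E", "E"]
bob = ["M", "M", "M", "M", "M"]

# Two made-up codings; both respect the order N < P < M < E.
coding_a = {"N": 0, "P": 1, "M": 2, "E": 3}
coding_b = {"N": 0, "P": 1, "M": 2, "E": 10}

for name, marks in (("Alice", alice), ("Bob", bob)):
    print(
        name,
        "median:", ordinal_median(marks),
        "mode:", ordinal_mode(marks),
        "max:", ordinal_max(marks),
        "mean (coding A):", mean_under_coding(marks, coding_a),
        "mean (coding B):", mean_under_coding(marks, coding_b),
    )
```

Under coding A, Bob's average is higher than Alice's; under coding B, it is the other way around, even though nothing about the students' work changed. The median, mode, and max do not depend on the coding at all, which is the sense in which they are the safer summaries for ordered labels.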

No penalties? Really?

Many instructors can get behind the idea of reattempts, but the idea of not penalizing them seems hard to swallow. It feels "un-rigorous", or like a contributor to grade inflation, or just plain unfair, especially to students who did adequate work on the first attempt. But we really mean it: Reattempts should not be penalized, and none of the alternative approaches seen here do so.

The reason is simply that growth is what grading is about, or should be about. And why would you penalize growth? Engagement with a feedback loop and the growth that takes place as a result are a normal, healthy part of human learning. It's not a sign of a defect or a deficiency. Instead, we acknowledge that learning takes time and effort, and that means normalizing reattempts.

Does allowing reattempts without penalty lead to grade inflation? Not really. “Grade inflation” refers to increases in grade levels without a corresponding increase in the quality of the work. That second part is key; simply having higher grades by itself isn’t grade inflation. Reassessments let students demonstrate that they have actually learned. When we allow reassessments without penalty, grades do tend to increase, but they are tied to concrete evidence of improvements in learning. This isn’t “inflation” — it’s accuracy, and validity.

What about the objection that reattempts without penalty are unfair to students who do good-enough work on the first try? This seems to stem from a combination of two misplaced ideas about grades: that they are compensation, or that they are the result of a competition.

If you worked 8 hours at a job that pays $15 an hour, and I worked the same job for 4 hours, and we were both paid $120 at the end of the day, or if we were both paid $60, then that's unfair — because there is supposed to be a precise relationship between the amount of effort and the payment received, and it's broken. But grades are not like this, or at least they shouldn't be. If they are like rewards at all (a concept I am not comfortable with, but I'll accept it for this analogy) it's a reward for simply finishing a job, like paying someone $50 to mow my lawn. It doesn’t really matter to me how long it takes, as long as it’s done.

Many also view grades not so much as compensation but as rewards for placing high in a grading competition. If you see the point of taking a class as beating everyone else in the class (an approach sadly common among American students), then allowing others to earn the same grade as you on an assessment or in a course, even though they struggled where you didn’t, can feel unfair. For those students, you can explain: What we are trying to do is not rank students but measure learning. There is no fixed, artificial amount of A, B, C, D, or F grades and so no need for competition to grab them, and one person’s success is not going to lessen another’s chances of success. There is no “curve” and every possibility that every student can earn an “A” through hard work, effort, and engagement with the feedback loop. So relax! Every student is working to meet the same standards and we are all on the same side.

Others might object that reattempts without penalty discourage students from doing good-enough work on the first try. There's something to this objection: Often the earlier objectives in a course need to be mastered in order for students to get the most out of the subsequent concepts. If students have less incentive to do that, there's a danger they'll fall behind if they don't give their best effort on the initial tries. It’s a legitimate concern; but this can be addressed through mindful course design (see the next section) and regular communication.

How to reassess without penalty

David wrote a post a while back that goes into detail on some of the mechanics of reassessment, and how to keep out of "grading jail" when you give reassessments. I'll try not to repeat his article. But I do want to stress that "reassessments without penalty" does not mean reassessment without responsibility. While we shouldn't penalize growth, we can design our courses so that students don't take the reassessment opportunities the wrong way.

Instead of penalizing, you can place reasonable limits on reassessment opportunities. For example, you can:

Limit the number of reassessments. In my Discrete Structures courses where I use specifications grading, for instance, when a new Learning Target is introduced, it appears on only three consecutive Learning Target quizzes, then it is "retired". That means that a student can continue to be assessed on it if needed, but only by request and with a cost (i.e. spending a token). This addresses the concern about students giving their best effort to do well on the first 1-2 tries; and it keeps the size of the quizzes down.
Limit the schedule of reassessments. For example, only do reassessments during your office hours; or only on Fridays, or every other Friday. An advantage of this approach is that it introduces time for thought and practice --- students can't typically take an assessment and then turn around mere hours later and do a reassessment.
Limit the frequency of reassessments. This approach works well with writing-intensive work. In my Modern Algebra class, which is primarily based on written mathematical proofs, students get two problems a week; they can revise any of these as often as needed, but with a cap of one problem per week. So revisions of problems aren't penalized, but they are scarce. The scarcity drives up the value, and it keeps my grading workload manageable. (Students can, if they want, submit multiple revisions of the same problem each week.)
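If you track these limits in a spreadsheet or a small script, the rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function and class names are hypothetical, quizzes and weeks are assumed to be numbered with integers, and the two constants simply restate the numbers from the examples above (three consecutive quizzes per Learning Target, one revised problem per week). It is not the actual system used in either course.

```python
from dataclasses import dataclass, field

QUIZ_WINDOW = 3          # quizzes a new Learning Target appears on before it is "retired"
WEEKLY_REVISION_CAP = 1  # distinct problems a student may revise in one week


def appears_on_quiz(introduced_on, quiz_number):
    """True if the target is automatically included on this quiz."""
    return introduced_on <= quiz_number < introduced_on + QUIZ_WINDOW


def needs_token(introduced_on, quiz_number):
    """True once the target is retired: reassessment is by request and costs a token."""
    return quiz_number >= introduced_on + QUIZ_WINDOW


@dataclass
class RevisionLog:
    """Hypothetical per-student record of which problems were revised each week."""
    by_week: dict = field(default_factory=dict)

    def can_revise(self, week, problem):
        revised = self.by_week.get(week, set())
        # Repeat revisions of the same problem are allowed; the cap is on
        # how many different problems are revised in a given week.
        return problem in revised or len(revised) < WEEKLY_REVISION_CAP

    def record(self, week, problem):
        if not self.can_revise(week, problem):
            raise ValueError(f"Weekly revision cap reached for week {week}")
        self.by_week.setdefault(week, set()).add(problem)


# A target introduced on quiz 4 appears on quizzes 4, 5, and 6, then is retired.
print([q for q in range(1, 10) if appears_on_quiz(4, q)])
print(needs_token(4, 7))

log = RevisionLog()
log.record(week=1, problem="Problem 3")
print(log.can_revise(week=1, problem="Problem 3"))  # same problem again this week
print(log.can_revise(week=1, problem="Problem 5"))  # a second problem this week
```

Note that none of these checks changes the mark a student can earn on a reattempt. They only govern when and how often a reattempt can happen, which is the distinction between a limit and a penalty.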

Not only can you place reasonable limits on reassessments, you can also require extras:

Require a metacognitive reflection. When a student submits a revision of a proof, require a brief but substantive reflection that summarizes the important items that caused issues on the previous submission, along with a specific explanation of what they did to improve their understanding and how they have demonstrated that improvement.
Require evidence of successful practice. In skills-oriented assessments, you might require students to work additional exercises related to the work being reattempted, and submit those as a down payment on a revision, or alongside a revision. You don't necessarily have to grade the additional practice, just give it a once-over to make sure it's mostly OK and done with good-faith effort. Then you evaluate the reattempt.

Finally, on a reattempt, you can ask students to mix up their methods. For example, you might require some (or all!) reattempts to be done orally in your office hours; or perhaps through a Flipgrid video. This isn't a penalty by any means, but it asks the student to try again but this time in a different way. That can help you be more sure that the reattempt is not just done mindlessly; it can also help the student, because changing up the approach can help them learn and retain the concepts better.

Above all: Feedback loops

What I've stressed through all of these Four Pillars articles is what's on the "ceiling" of the visual: Feedback loops.

All human learning happens by engagement with a feedback loop, and we are simply remodeling our grading practices to be more in line with that fact. Giving reattempts without penalty is, in my view, the visible sign that we are serious about that alignment. But all of the pillars are grounded in the idea that grading should be about growth, and this takes time, effort, communication, and patience.

Bonus extra thoughts

Beware of pushback from students if you use a policy of reattempts without penalty. Wait, what? Students? Why would they push back on this when there is virtually no downside for them? I know it seems weird, but it's real. The ones who push back are the highest-flying students, the ones who take great pride in getting things right the first time. They are the ones for whom this concept will seem singularly unfair. In fact the first time I tried a reattempt policy in a class (a Calculus 2 class for engineers at Vanderbilt, so high-flying students indeed), one of them accused me of practicing "academic communism". I'm no therapist, but I think there's a line between taking pride in work well done and centering how you value yourself as a human being on getting good grades. In some ways I think those kinds of students need growth-focused grading practices more than the ones who struggle with the material.
There's another objection to reassessment policies related to something I mentioned above: If students can reassess (particularly if there's no penalty) then they might not put their best work into an early concept, which is then built upon by a later concept which is also assessed, and so on, creating a snowball effect where students are having to assess on newer topics before mastering the older ones. This snowballing is a little worse than simply falling behind in a class because it creates exponential growth in student assessments. This too is a legitimate concern. It merits a deeper dive, but for now: (1) I think you can address the snowballing issue pretty effectively just by frequent, high-quality communication with students and by keeping the individual standards simple, and (2) we have to remember humans don't learn in a linear way, where we master a subject first before moving on to something else that depends on the earlier subject. We're constantly having to go back to prior knowledge and practice, fill in gaps, unlearn bad habits, etc. So while the snowball effect can be serious, I don't see it as a glitch in alternative grading setups but as an organic reflection of how lifelong learning takes place. It has to be managed but its presence doesn't mean the system is broken.