Specifications grading and "help"

Can students in a specs grading setting do work without help? Do we care?

This is a repost from Grading for Growth. I post there every other Monday (my colleague David Clark does the other Mondays) and usually repost here the next day. Check out the bottom of this post for some additional thoughts that didn't appear in the original!


Recently, a colleague emailed to let me know he was thinking about using specifications grading in his courses this fall, but had some questions. The ensuing back-and-forth forced me to think about some wide-ranging issues about grading that David and I will be writing about here in the future. My colleague had read a series of posts I published at my website in January, especially this one, about the design of my Modern Algebra course, and asked:

How do you know what the student can or cannot do on his/her own (without non-trivial amounts of help)? In general, I like the idea of the proof portfolio; but, do you ever get students who can’t produce a correct, coherent argument without help?

Background: Modern Algebra, an upper-division course for mathematics majors, uses a specs grading approach that centers heavily on students assembling a portfolio of their work on proof-based problems in abstract algebra (the “proof portfolio”). As with any specs grading implementation, if a student turned in work that wasn’t up to specs, they got feedback and the opportunity to revise and resubmit, iterated until the standard was met.

I think my colleague’s questions are important and good. Here’s a summary of my reasoning through them.

How do I know what the student can or cannot do without nontrivial amounts of help? Like a true mathematician, I’ll begin by questioning the premise and the terms of the question itself. First, what do we mean by “help”, and how much of it is a “nontrivial amount”?

There are some obvious cases where the type and amount of help are not academically acceptable. For example, a student plagiarizing an essay or proof and turning it in as their own work is “getting help” of the wrong sort and in far too great a quantity. Or a student in an IBL class who is unable to put their solution to a problem on the board without having a friend coach them through every step is also getting too much help.

But once we step away from the obvious, things get very murky very quickly. Suppose that student in the IBL class does put their solution on the board and fields reasonable questions about it, but only because they practiced it the night before with a friend. Is that “too much help”? Or, suppose the other student didn’t plagiarize the essay or proof but instead turned in work that was their own, and it didn’t meet quality standards, but then they got feedback and revised and resubmitted until it did. Is that “too much help”?

If I look at the end product — the solution on the board, the proof or essay — how do I know how much of that product is the student’s thinking and not influenced heavily by another? Where does the student stop, and their sources of “help” begin?

In one sense, once you get past the obvious academic-dishonesty situations, this question is unanswerable. When I look at a student’s work, as long as it doesn’t violate academic honesty rules, it’s simply not possible to disentangle the student’s thinking from that of the others they have interacted with to advance their thinking (which is one way to think about “getting help”). A social constructivist view of learning would agree: through that lens, there is simply no such thing as “working without help”, because all knowledge is built through interaction with others (another way to think about “getting help”). So I can evaluate what the student does and whether it meets specifications, but I can’t determine how much of that comes only from the student. In fact, it’s possible that none of it truly is “from the student” rather than being significantly rooted in other people and resources.

But I could be wrong. Maybe it’s possible to set up a situation where the amount and types of help available to students are controlled for, which would give clearer insight into what the student can or cannot do on their own. Well, in fact, I know that this is possible, and we’ve been doing it for centuries in the form of timed assessments.

The whole point of a timed assessment — one-and-done, without the possibility of revision/resubmission later, since that would be a form of “getting help” — is to limit access to time and resources, to see what students “really know”. But let’s consider the consequences of that approach:

  • Timed assessments without the possibility of revision ramp up anxiety levels, triggering a fight-or-flight response that adversely affects cognition — effectively corrupting the data about learning collected by the assessment. (David wrote about some related research on this last week.)
  • Timed assessments tend to be based on memory and recall, which can be different from “knowledge”. When I give a quiz on the Chain Rule in calculus, what I’m really giving is at least as much a quiz on students’ ability to remember the Chain Rule as on their ability to apply it. If you believe this is splitting hairs over semantics, try giving two versions of such a quiz, one with notes allowed and the other without.
  • Timed assessments are notoriously brittle with respect to logistical issues, something we all experienced during the COVID-19 pandemic as students had to enter quarantine and deal with their lives outside class.

So we’re kidding ourselves if we think that one-and-done timed testing controls for the use of outside help without letting in all sorts of other confounding variables. You might address the issue of students not being able to do work without help, but doing so introduces more problems than it mitigates, and it’s questionable whether the data you collect really measure what you think they measure. Even if it looks like independent work, it might just be help on a time delay.

That gets me to another angle on this question. Is this idea of students being able to do work without significant amounts of help something we even want in the first place?

The word “independence” plays a big role here. Maybe it’s not so much that we want students to build knowledge of the subject without any interfacing with another human being at all — that’s probably not possible. All we want, maybe, is for students to be able to demonstrate their fluency without a lot of explicit coaching. Independently, in other words.

In my conversation with my colleague, one of the things I noticed was that in my department’s official syllabus of record for Modern Algebra, there is no course outcome regarding the idea of “independence”. That doesn’t mean we find this unimportant, just that it’s not something we instructors are charged with assessing. So, I don’t. My course objectives and the module-level objectives that flow from them address goals about writing, problem-solving, thinking with abstraction, and so on. But I make no claim that students should do this “independently” (apart from not breaking the academic honesty rules) and I don’t have any mechanism for demonstrating independence. Until this email discussion, I’d honestly never even thought about it.

People learn by engaging with feedback loops. This is a fundamental truth about humanity and the core idea of mastery-based grading. So while there is value in “independent work”, there is more value in students demonstrating the ability to take help that is offered and use it to grow. If a student can demonstrate independent thinking, but cannot grow through feedback — through help — then I do not consider that a success. But if the student can show evidence of growth through feedback loops, that is a success, and whether or not the student struggles to enter the loop “independently” is simply not that important.

But: It’s certainly not wrong to value independence, however you may operationalize it, and make it something you do assess. If you really wanted to measure whether students can do work independently, I’d suggest making it into a specs grading bundle. Include in this “Independence Bundle” whatever materials you feel you need to determine that students can do work without lots of explicit coaching. Perhaps to complete the bundle, students have to pass four oral exams in your office during the semester; or master a certain number of learning targets using timed assessments (though see above); or whatever. Take my colleague’s question and turn it around: What evidence can students show you that they have attained independent mastery? Then ask for it, and evaluate it according to specifications.

Likewise, am I saying that we shouldn’t use timed assessments? No. I’ll be doing it myself this fall. But if you do, then those timed assessments should be part of a well-balanced diet of assessments; the timed assessments might target specific learning outcomes on the low end of Bloom’s Taxonomy, supplemented with other, untimed assessments that target the top end. Students who don’t meet the specifications on their timed assessments could have the option to take a similar problem on a later quiz, or do an oral exam or make a video of themselves instead. And every reattempt, timed or otherwise, should require students to do metacognitive work, to show that they have thought about their previous work and the feedback it received.

And ultimately, the instructor is the final arbiter of how much help is “too much”. How much help are you willing to give to a student, or allow a student to incorporate, and still be satisfied that their work demonstrates individual mastery of the topic? That’s the limit: give all the help in the world to that student up to that point, and no farther.


My colleague asked a second question: Do you ever get students who can’t produce a correct, coherent argument without help? My answer might have seemed flip: All the time, and I’m one of them. Ask my students, and they’ll gleefully tell you that I never do even a basic arithmetic calculation during class unless I check it on Wolfram|Alpha at the same time. I’ve lived long enough to have a deep mistrust of my supposed independence. But I do trust my ability to learn things. In the end, the whole point of specifications grading and other methods like it is to build that skill, of seeking out and taking help and making it truly helpful for yourself.


Some additional thoughts:

  • A comment on the original post wondered if I was against timed testing. I'm not; I'm just against timed testing without the possibility of growth or improvement. My own use of timed quizzes involves being able to reattempt similar versions of problems on later quizzes, or replace them entirely with in-person/Zoom oral exams or student-created video. My point here isn't "timed testing bad!" but rather that the notion that one-and-done timed assessments control for outside help while keeping everything else the same is naive. The timed, one-and-done environment introduces confounding variables that, to me at least, seriously call into question the internal and construct validity of the test. If you give a timed test, you just have to be aware that you may not be measuring what you think you're measuring.
  • Another comment, from email and not on the original, wondered how the idea of specifications grading and the possibility that students aren't showing enough independent mastery of work plays with employers and graduate schools. It's fine and good, that is, for a person just taking the class for general education credit to be able to get a good grade without being able to do work without "significant help"; but what about people heading for Ph.D. programs or serious jobs in the field? Two responses to this. First, I've been using mastery-based grading for around seven years now, and I'm pretty good at staying connected with former students; not once has an employer, graduate school, or student given any indication that their experience in specifications grading put them at a disadvantage. On the contrary, what feedback I have gotten is that the system helped them to feel more at ease and have a more enjoyable experience, which helped them to have greater self-efficacy, intrinsic motivation, and openness to learning. Second, I recall very clearly some of the responses from the Steelcase people when I'd ask them what they were looking for in a new hire or an intern. Not a single person said anything about "independent mastery" of content knowledge. All of them, on the other hand, repeatedly said that the most important thing interns and new hires needed to have was the ability to learn things quickly, and to use feedback to learn independently. So yes, independence is valuable, but not in the ways you might think.