Something happened this past week that made this whole series I'm doing about building my Fall courses become a lot more real: July ended. Now that I'm writing down an "08" for the month, I'm reminded that all this building isn't just an abstract academic exercise but will come into contact with real students in just a few weeks' time. So I'd better get on with it.

In the previous posts, we've looked at the choice of modality for my Calculus class in the Fall, the all-important process of determining learning objectives, and how the activities and most recently the assessments will take shape. We've come now to the point that might interest a lot of people: The grading system for the class. We have a lot to talk about here.

When I first started teaching, it wasn't uncommon for me to *start* with the grading system when building a class. I'd start with the axiom that we'd have three exams and a final, plus some quizzes (and maybe homework). If I felt daring, I might throw in a project or presentations. Then I'd determine the number of points for each of these and declare that 90% of the total was an A, 80% was a B, and so on. Easy, right? That's why it's tempting to start there.

I think those of us who teach and who have graded using points – which is pretty much all of us – have always had some deep suspicions about point-based grading systems, which we repress, because examining those suspicions would require a comprehensive rethinking about... everything. Six years ago, 15 years into my teaching career, I finally put those feelings into writing, and decided that I'd had enough of the tyranny of points. And thus began an exploration of what we now call **mastery grading** that continues today, and in this article.

Very briefly, mastery grading is an umbrella term, including such practices as standards-based grading and specifications grading, that refers to grading practices that share some common characteristics:

- Typically, **student work is not graded using points, but is instead evaluated relative to clearly stated criteria** that describe acceptable quality. The grade is assigned with a two-level rubric, not necessarily literally "Pass/Fail" but some binary determination of whether the work meets the criteria or doesn't. Sometimes more finely-graded rubrics are used.
- Since there are no points, there's no concept of partial credit. Instead, students get **significant, helpful, actionable feedback from the professor** that gets them thinking about how to improve.
- Finally, mastery grading builds in ways for students to take that feedback and **revise and resubmit their work**, **in a feedback loop that continues until the work has met the criteria** (or the student runs out of opportunities).

I really liked the way Tri-County Early College phrased it: *"All assignments must be completed at a level of competency and are in-play as long as that takes (i.e. grades are never used as a punitive measure and zeros are never given)."* [My emphasis]

That's a very low-resolution overview of the concept. There's much more at the Official Mastery Grading FAQ.

Here's a description of how each *individual* assessment is graded and how students can revise and resubmit their work, according to the principles above.

**Checkpoints** are take-home exams for demonstrating skill in the 24 Learning Targets of the course. Each Checkpoint contains one (multi-part) problem targeting exactly one Learning Target, for each Learning Target that has been discussed in the class up to that point. These are cumulative, with each Checkpoint containing not only new problems for recent Learning Targets but new versions of problems from previous Checkpoints. Since last week, I've written two sample Checkpoints to illustrate the idea: Here is the sample Checkpoint 1, which will be given in week 2 of the course; and here is the sample Checkpoint 2, which will be given one week later. (If you're wondering, here is an initial schedule of when each Checkpoint will be released and what new targets it will contain in addition to all the older ones.)

Each problem on a Checkpoint is graded simply "check" or "x" depending on whether the student work meets the criteria. The schedule linked above also has brief descriptions for each Learning Target about what students will be asked to do and what will constitute acceptable work. When a student submits work on a Checkpoint (via working the problems out on paper and then submitting a scanned PDF to the LMS), I'll look it over, determine whether it's a "check" or an "x", then put the grade in the LMS along with written feedback on the work. If the grade is an "x", the student can attempt a new version of that problem on the next Checkpoint (or use an alternative method, which I described last time).

**Application/Extension Problems** (AEPs) are problem sets used for demonstrating skill in applying basic knowledge. I have 8 of these planned and will start drafting them next week. These are more extensive and nuanced, so instead of "check/x" I use the EMRN rubric, which I showed last time. This work is also submitted as a PDF on the LMS, and as with Checkpoints, I examine the work, assign the letter, then give helpful feedback. Students can then revise and resubmit any AEP that earned an M, R, or N, subject to the **Two-Item-per-Week Rule** (no more than two AEP submissions can be made in any given week) and the **Revision of N Grades Rule**, which states that revising work that received an "N" (Not Assessable) requires spending a token (see below), to prevent students from deliberately submitting highly flawed or incomplete work just to get partial feedback. Otherwise there are no limitations on revision and resubmission.

The **Daily Prep** and **Followup Activities** that bookend students' engagement with content are also graded check/x on the basis of completeness, effort, and deadline compliance. They are trivial to grade. However, unlike other assessments, there's no opportunity to revise or resubmit; they are meant to be done once and correctness isn't part of the criteria. I am OK with that since it's so easy to earn a check on them — just do something reasonable for each item and turn it in on time!

All of the other assessments in the course are graded with points, but with good reason. **Online homework** is done through WeBWorK, which uses points and there's nothing I can do about that; however, the system does allow students to redo problems as many times as they want if the answer is wrong. And the miscellaneous **Engagement Credit** opportunities will be graded either 0 points or 1 point, but these are just labels indicating that the opportunity was either done or it wasn't. It's just clearer to talk with students about accumulating *points* for these than to say something like "*accumulate at least 90 'acceptable' marks*".

I still have to give course grades of A, B, C, D, or F, so having determined how each individual assessment is graded, the next task is figuring out how to map student work onto these five letters and their plus/minus variants.

First, one more thing about Learning Targets: There are two levels of attainment with these. Earning a single "check" on a Learning Target earns you what I call *Proficiency* level with that target. Earning a *second* check on that Target — by working a second Checkpoint problem sufficiently well, or using an alternative method — takes you to the next level, which I call *Mastery*.

Back to the course grade: I find it helpful to start with the "C" grade. A "C" represents baseline competency in the course — the "C" student has demonstrated the minimum level of skill needed to warrant allowing them to go on to other courses that use this one. What does that look like? Individual instructors will set this differently, but for me, baseline competency means:

- The student has demonstrated Proficiency on all the Core learning targets and Mastery on at least half of them.
- The student has demonstrated Proficiency, if not Mastery, on about half of the non-Core or "Supplemental" Learning Targets.
- The student has done a significant but not overwhelming amount of acceptable work on applications.
- The student has done a fair amount of correct work on the online homework.
- The student has demonstrated reasonable engagement with the class.

If I could truthfully describe one of my students in these terms, I'd feel OK — if not supremely confident — that if they went on to Calculus 2 or Physics or something that requires my class, they could succeed if they work hard and get help when needed. That, to me, is what a "C" signifies. On the other hand if one of my students *did not* meet one of the above descriptions, I'd have reasonable doubt about whether they are "baseline competent" in the subject, and this would warrant a grade below C.

From this broad description, it's fairly easy to make **specific criteria** for earning a C in the class. A student earns a C if they satisfy all of the following:

- **Earn Proficiency on all 10 Core Learning Targets** and **Mastery on at least 5 of them**.
- **Earn Proficiency on 6** (out of 14) **Supplemental Learning Targets**.
- **Earn either an E or an M on 5** (out of 8) **AEPs**.
- **Earn at least 140 points** (out of 192) **on WeBWorK problems**. (That's 73%.)
- **Earn a "check" on a total of 34 Daily Prep or Followup Activities**. (There are 24 of each of these, and we are counting the *total*. So if a student struggles with the Daily Prep, they can make up for it by doing more Followups, and vice versa. Also, that's 71%.)
- **Earn at least 60** (out of 100) **engagement credits**. (Daily Prep and Followup Activities earn one engagement credit per "check", so satisfying the previous bullet earns 39 of those 60; the other 21 come from miscellaneous activities.)

Students have to satisfy *all* these requirements to earn a C. If they miss one, their grade will be at most a C-. This is jarring to students because they are used to having poor performance in one assessment "made up for" through good performance on another. But here, I require **across-the-board competency**. Even if a student earns Mastery on all 24 Learning Targets for example, but doesn't show competence on the AEP's, they will not earn a C.

To get the criteria for a grade of B, the idea is that **it's everything needed to earn a C, plus extras **in terms of quantity, quality, or both. Likewise for an A, the criteria are **meet the requirements for a B, plus extras**. Here's what I decided on for these requirements, as well as requirements for a grade of D:

| Category | D | C | B | A |
|---|---|---|---|---|
| Core Learning Targets (10) | 5 Proficient | 5 Proficient, 5 Mastered | 10 Mastered | 10 Mastered |
| Supplemental Learning Targets (14) | 3 Proficient | 6 Proficient | 6 Proficient, 3 Mastered | 6 Proficient, 6 Mastered |
| AEPs (8+) | 2 M+ | 5 M+ | 2 E, 4 M+ | 4 E, 2 M+ |
| WeBWorK (192) | 90 | 140 | 160 | 180 |
| DP + FA (48) | 24 | 34 | 39 | 44 |
| Engagement credits (100+) | 30 | 60 | 70 | 80 |

"M+" means "either an M or an E".

For the D grade, I had to decide what a safety net would be for a student who doesn't show baseline competency but does show significant progress. Roughly speaking, I decided that being "halfway to a C" defines the requirements for a D. Students who don't meet all the requirements for a D earn an F.

A common feature of mastery grading systems is the **token**, which is fake currency (think: Bitcoin but for grades) which students can spend to bend the rules of the course. In the Calculus class, every student starts the semester with 5 tokens. They can spend them using this menu, where everything costs one token:

- Attempt a second Learning Target in a given week through non-Checkpoint means
- Submit a third AEP (either revision or new submission) in a given week
- Revise an AEP graded "N"
- Extend the deadline on a Checkpoint by 12 hours (request must be submitted prior to the original deadline)
- Extend the deadline on a WeBWorK set by 24 hours (request must be submitted prior to the original deadline)
- Purchase 3 engagement credits

About plus/minus grades: The table above is used to determine the "base grade" in the course, which is the A/B/C/D/F grade without plus or minus modifiers. If it were up to me, there would be no plus or minus grades because I think they complicate things unnecessarily. According to my superiors, though, I *must* have some system for giving plus/minus grades — but there are no mandates for how I do it. Here are my rules for this:

- A "plus" is added to the base grade if all requirements for a base grade are satisfied,
*and*the Learning Target (both Core and Supplemental) or AEP requirement for the next level up is also satisfied;*and*the big-picture portion of the final exam is passed. (I mentioned the final exam last time.) - A "minus" is added to the base grade above in any of the following cases: (1) All requirements for a base grade are satisfied
*except one*, and that one is no more than two levels below the others;**or**(2) the student meets the minimum requirements for a base grade (i.e. none of the requirements for higher levels are met) and but does not pass the big-picture portion of the final exam**or**(3) the student meets the minimum requirements for a base grade but does not complete the Functions Bootcamp satisfactorily by Monday, September 14.

(The Functions Bootcamp is a special unit I am making to get everyone up to speed on mathematical functions.) Like I said, this complicates things and I'd prefer not to deal with it at all. But, since I do have to, I've settled on having a "plus" awarded for completing one grade level and "going above and beyond"; and a "minus" awarded for "almost" situations, or for not doing sufficiently good work on the final — and to use as a stick to get people to complete the Functions Bootcamp.
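The modifier rules can be sketched as a small Boolean decision. Again, this is a hypothetical illustration only: the flag names are my own invention, and computing each flag (for instance, whether exactly one base requirement was missed, by no more than two levels) is assumed to happen elsewhere from the student's record and the requirements table.

```python
# Hypothetical sketch of the plus/minus rules; not actual course tooling.
# Each boolean flag is assumed to be computed elsewhere.

def grade_modifier(meets_next_level_lt_or_aep: bool,
                   passed_big_picture_final: bool,
                   missed_one_within_two_levels: bool,
                   at_minimum_for_base_grade: bool,
                   completed_functions_bootcamp: bool) -> str:
    """Return '+', '-', or '' for a non-F base grade."""
    # Plus: base grade fully met, AND the next level's Learning Target or
    # AEP requirement met, AND the big-picture final passed.
    if meets_next_level_lt_or_aep and passed_big_picture_final:
        return "+"
    # Minus case (1): all base requirements met except one, and that one
    # is no more than two levels below the others.
    if missed_one_within_two_levels:
        return "-"
    # Minus cases (2) and (3): bare-minimum base grade, and either the
    # big-picture final or the Functions Bootcamp fell short.
    if at_minimum_for_base_grade and (
            not passed_big_picture_final or not completed_functions_bootcamp):
        return "-"
    return ""
```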

This system is an iteration of the grading system I described here, one that I ended up liking a lot, for the same reasons I like mastery grading in general: It produces fewer false negatives (poor grades despite good understanding of the material) and fewer false positives (good grades, poor understanding), and it helps alleviate stress and anxiety over grades because of the robust revision policy.

I've used systems like this before and I have a sense of what to expect. Namely, students will find it confusing at first and many will freak out over it, especially given the unsettled nature of this coming semester. It looks weird and hard and nonintuitive, and definitely not like their other courses or previous schoolwork. Helping students adapt takes careful explanation and presentation in the beginning — not necessarily explaining every facet of the system all at once in the syllabus, but giving students time to absorb it — and especially, just living with the system and working with it. Generally my experience is that 2-3 weeks into the course, around the first exam or Checkpoint, it clicks with students and they come to really appreciate it. It helps to engage in consistent marketing: explaining the benefits of "eventual mastery", highlighting the ability to redo almost anything in the course, and pointing out that students' grades *never go down* during the semester as the result of an assessment.

I also expect that no matter what I do, some students will continue to dislike this system and prefer the old way, which they believe is clearer and simpler — despite the fact that points-based systems hide so much information about *where the points come from* that you can't really say they are either clear or simple. I just have to be ready to be patient with those folks, and accept that I'll never win some of them over.

In terms of grading load, my experience is that it's about the same as or maybe a little less than traditional grading systems, despite the revisions and regrading, because each individual item goes from taking several minutes per student (*Should this get 10 points out of 12? Or 8? Or 9? Or 6? Or 0?*) to seconds (*Is the work good enough or not?*). The load isn't distributed the same way, though; the work is actually pretty light during the semester but ramps up tremendously at the end (thanks to procrastination). I'll have a lot of work to do in December. I've come to think of it like being a CPA during US tax return season.

I will probably change this system before classes start. In particular I'm thinking about eliminating the whole engagement credit portion of the grade and just counting the *Daily Prep + Followup* number as a proxy for "engagement". It would simplify things; but I like having incentives for being engaged. Generally, my *modus operandi* is to write up a fully realized course, then tell myself to cut out 10% of it to simplify. I'm not sure what will get cut, but the cuts are coming.

I'm also not sure how students will respond to this system under remote instruction. I've used a similar system in online and hybrid classes before, but never when students were *forced* into online or hybrid formats while the world seemed to be spinning out of control. Will this system be just too much for students to handle? Will I need to simplify beyond my customary 10% cuts? I don't know. I'll have to return to this later, once more of the course is built, and decide. And when classes start, I'll ask for frequent feedback, create some simple tools for tracking grades (like a checklist), pay close attention to students, and be ready to change things up.

As we continue to get ready for Fall semester, one of the concerns that I keep hearing is one that I've heard in almost every discussion of online learning since way before Covid-19: That academic honesty is a lot harder, if not impossible, to guarantee in an online course than it is in a face-to-face course. In fact, I've been in meetings in pre-Covid times where academic leaders have dismissed online learning out of hand, simply because of their own personal belief that students will use the online platform to cheat their way through the class. Now that no dismissal of online learning is possible, we need to come to grips with this concern.

On March 15, the night before we threw the switch at my university to make every course online, I wrote this blog post that included this bit:

This is not the time or place to insist on the highest levels of academic excellence, or even airtight mechanisms for ensuring academic honesty. Yes, it's quite possible that students working at home in an online setting could cheat on assignments in ways they may not in a face-to-face setting. [...] There are two ways to respond: being OK with this, or setting up a mini-surveillance state. The first option is the simpler of the two and so that's what you should go with. Trust students more, and give them more grace and lenience, than you normally do – even more than you are comfortable with. You might be surprised how they respond.

Now that we are coming up on the Second Iteration of the Big Pivot, I would update this in two ways. First, now that we're no longer in emergency mode (despite what it may feel like), we can and should go back to insisting on high levels of academic excellence. Second, there's a third way to respond to the possibility of online cheating other than "being OK" and "police state": Namely, setting up systems that mitigate cheating while still giving students trust, grace, and lenience.

It may seem impossible to insist on high levels of academic excellence on the one hand and at the same time provide trust, grace, and lenience on the other. But fortunately there's a system that gives us the best of both: It's called **mastery grading**.

Of course if you've been around this website for any length of time, you know that I have been a proponent of mastery grading, a.k.a. specifications grading, ever since 2014 when I perhaps foolishly decided on impulse to redesign my entire upcoming semester based on using specifications grading. Here's a more comprehensive list of some of my earliest posts on this subject. Over the course of the last six years, I've iterated on my use of mastery grading to the point where I think I have a stable framework for implementing it in just about any class I teach, even online. So, I'm being a little cheeky in the previous paragraph.

And yet, I don't think enough instructors address the problem of academic honesty by looking at *their systems*. We try to treat the symptoms by anything ranging from grave threats in our syllabi to expensive and Orwellian remote proctoring systems (which sometimes fail under pressure and come with significant privacy concerns). But the real cause of the academic honesty problem isn't that we aren't threatening enough or that we haven't installed the right software. **The problem is what's always been the problem: Points, and the grading systems based on them**.

Think about it: If you're a student, what do you need to do in order to pass a class? Under traditional points-based grading, the answer is: **Accumulate a sufficient number of points**. Note that I do not say *earn* a sufficient number of points. We like to think that those points are a natural metric of hard work and actual learning of concepts and processes. But in fact, the very systems in our syllabi that try to connect learning to points, also make it painfully clear that learning is a means to an end — and that end is the accumulation of points. So it's nice if you can actually *earn* the points through honest means. But when push comes to shove, what matters — as communicated by our own syllabi through points-based grading — is simply their accumulation, through whatever means necessary.

So we set up a deficiency model of learning in our classes by using points as a proxy, similar to the way that in a capitalist society we have a deficiency model of human value using dollars as a proxy. The way you prove your worth in life is to *accumulate*. The way you prove your worth in class is also to *accumulate*. And we wonder what drives people to cheat in either setting? And we wonder why so many students find college to be a dehumanizing experience?

If we are really serious about mitigating academic dishonesty, if we are really serious about caring for students and making their Fall 2020 experience an outstanding one, we'll drop the pretense that this is about F2F versus online, and instead take the simplest and best action possible: **Get rid of points-based grading and adopt mastery grading instead.**

At least three things will improve immediately:

1. **The incentive to cheat goes away**. Under points-based grading, people cheat because it's high-risk/high-reward. You might get caught and face severe consequences, but you might *not* get caught and accumulate yourself some serious points. On the other hand, mastery grading is predicated upon, among other things, having a robust revision policy for most or all forms of graded work in a course. If you can revise and resubmit just about any significant piece of work — multiple times, and get helpful feedback each time — until you're happy with your grade, then the value proposition of cheating becomes empty.
2. **Student motivation levels rise**. By getting rid of points, the narrative about student grades shifts from game-playing ("What do I need to get on the midterm to have a B in the class?") to concept mastery ("I need to study more on Learning Target DC.2"), and this leads to students actually connecting with the material as an end in itself, rather than seeing the material as just a delivery mechanism for points. Making a connection to the material improves *competence*. Being able to demonstrate skill in multiple ways — another key tenet of mastery grading — improves *autonomy*. Competence and autonomy are 2/3 of the ingredients for authentic intrinsic motivation (as I wrote about here). And intrinsically motivated students are less likely to cheat than extrinsically motivated ones.
3. **Student stress levels drop**. The flexibility that mastery grading provides to students means that they don't need to stress about some of the major stressors of Fall semester. If they feel sick and end up missing a day of class where there's an assessment, no worries — just take the assessment at the next session, or set up an oral exam. If they are maxed out with a job and taking care of their family one week and just can't put a lot of effort into an assignment — no worries, just do your best, submit a complete draft, and you'll get feedback and another chance. There's a great power in knowing that your grade is based on what you *eventually* show that you know.

I'll be sharing details of my mastery grading setup for Calculus and Discrete Structures soon — it's roughed out, but I need to clean it up. Here's my challenge to you this fall: as long as we're in a situation where everything else is being shaken to its foundations, why not shake things up in a good way by upgrading your grading? The benefits of doing so may go way beyond improved academic honesty, although that by itself is enough.

*Welcome to another installment of the 4+1 Interview, where I track down someone doing cool and interesting things in math, technology, or education and ask them four questions along with a special bonus question at the end. This time I caught up with Kate Owens, a professor in the Department of Mathematics at the College of Charleston. Kate is an innovative and effective teacher whose work with students is well worth paying attention to, and she's someone I've enjoyed interacting with for several years on Twitter and elsewhere.*

*You can find more 4+1 interviews here.*

**1. What's your origin story? That is, how did you get into mathematics, what led you to earn a Ph.D. in the subject, and what led you to the College of Charleston?**

As a kid, I was often bored in math class at school because I didn’t find it particularly challenging or engaging. My dad has a Ph.D. in mathematics and he was always happy to give me new mathematical ideas to think about. In seventh grade, we were supposed to design posters featuring our favorite number, and I picked 43,252,003,274,489,856,000 -- the number of permutations of the Rubik’s cube. I had no idea how to solve the cube, but I was really interested in things like combinatorics and math that wasn’t the “boring stuff” they were making me do in algebra class.

In high school, my plan was to study astrophysics or aerospace engineering. Inspired by images coming from the Hubble telescope, my dream job was to work for NASA. During my first few semesters of college, I was an astrophysics major. One day I realized that I was much happier in calculus than in physics; I spent most of my physics courses feeling confused. More than once I went to my calculus professor asking for physics insight. I got the sense that I spoke mathematician and not physicist, and I changed majors. Eventually I finished my degree in Pure Mathematics from U.C. San Diego. I decided to pursue graduate school in mathematics and I was accepted into the Mathematics Ph.D. program at the University of South Carolina. I finished my M.A. there in 2007 and completed my Ph.D. in 2009.

While in graduate school at the University of South Carolina, I fell in love with another graduate student. He finished his Ph.D. in 2007 and we married in 2009, right as I wrapped up my own dissertation. We spent a long time talking about how we could achieve both our family goals and our career goals, and eventually decided that we would follow his career path -- even if it meant giving up my own job search. My husband accepted a postdoc position in Texas; after a year, he transitioned to an industry job and we moved to Charleston, South Carolina. I had contacts from graduate school and spent a few years at the College of Charleston as a Visiting Assistant Professor before a permanent Instructor position became available. I’ve been teaching here since 2011.

**2. One of the innovations you've championed is the use of mastery-based grading. In your view, what is the purpose of mastery grading, and how well does it work with your students?**

Before I switched to mastery-based grading, I had concerns about how well grades were correlated with student learning. Grades, even those given on assignments early in the semester, always seemed like a final judgement since my students didn’t have a way to demonstrate growth in their understanding. Also, I realized that I couldn’t always diagnose knowledge gaps among my students; many students might earn the same grade on a test for very different reasons. After handing back their assignments, I wouldn’t know how to advise them on what topics they should review or how they could improve. I wanted my gradebook to reflect exactly what content a student knew at this particular time, instead of what percentage of topics they knew at some point in the past.

Now that I’ve switched to mastery-based grading, my gradebook reflects what each student presently knows and what topics they still need to work on. Additionally, it gives me an overview about what the entire class knows already, what they’re still struggling with, and what ideas are most appropriate for us to tackle together next.

The reasons I switched to mastery-based grading are still there, but the two big reasons I won’t switch back to traditional grading are different. First, mastery-based grading has fundamentally changed the kinds of conversations I have with students. I no longer have conversations that begin with questions like, “Why did I get only 8 out of 13 points on this problem?” or “What percent do I need to make on Test 3 to have an average of 88% in the class?” Instead, conversations more often begin with things like, “I don’t see how a quadratic equation can tell me if its parabola has *x*-intercepts or not, can you help?” Students are able to track what they’ve mastered and what they haven’t. Second, my system allows students to improve old scores, so students are incentivized to learn old material that they didn’t quite get the first time. I believe in the importance of having a growth mindset. Mastery-based grading is built on the belief that grades should reflect demonstrated knowledge and that providing many opportunities for the demonstration of newly gained knowledge is important.

**3. College of Charleston is one of the oldest higher education institutions in the United States, founded in 1770. Have you perceived any tension between the history and tradition of the institution on the one hand, and your innovation in the classroom on the other? (If so, how do you make innovation work for you? If not, how does the culture of CofC support both tradition and innovation?)**

You’re right -- the College of Charleston is the 13th oldest educational university in the United States. We are a public, liberal arts college with an undergraduate enrollment around 11,000. The Math Department has over 30 faculty members, whose research areas encompass algebra, numerical methods, logic, number theory, statistics, and more. I believe that our differences in background, research, and instructional methods give us strength as a department. Since CofC is a small liberal arts college, it means that a lot of our mission is about delivering quality undergraduate instruction. Although each faculty member makes different choices in their courses, we have a supportive Department that allows each of us to make our own academic judgments about our courses.

In the Math Department, I’ve helped pilot a program turning traditional, lecture-based college algebra courses into emporium-style classes. In our program, students work only on topics they haven’t yet mastered, and they have the opportunity to get more one-on-one help on a daily basis. Over the last several years, our data have shown that students are more successful in these college algebra courses compared to the traditional format, both in terms of course grades and their raw scores on our department-wide final exam for the course. We are now researching longer-term trends in a student’s path through several linked courses (college algebra -> precalculus -> calculus I -> …), and we hope to find ways to raise student success through this course sequence. I’m also piloting an emporium-style approach in precalculus and gathering data about how it’s impacting students.

Outside of the Math Department, one way that CofC supports innovation is through our “Teaching and Learning Team (TLT) for Holistic Development” division. Part of TLT’s mission is to provide support and professional development for faculty interested in cultivating a culture of innovation on campus and in their courses. More than once, I have participated in Professional Learning Clubs about mastery-based grading. Each was both a reading group -- we read Linda Nilson’s book *Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time* -- and a support group, where we offered each other ideas about implementing mastery-based tasks or non-traditional grading schemes in our courses. I’ve also been a panelist talking about mastery-based grading at TLTCon, CofC’s “Teaching, Learning, and Technology Conference.” There are several faculty members here at CofC who are using non-traditional grading schemes, and I hope our group will continue to grow.

**4. What's something with your teaching and your students right now that you are excited about?**

Our semester is almost over here at CofC. Our last day of classes is April 23 -- only a couple of weeks from now! On the last day of my precalculus course, we have what I call a “Re-Assessment Carnival.” On this day, each student may choose to re-try as many problems as they can complete in the 50-minute class. This gives them a last opportunity to demonstrate knowledge of our course standards before the final examination. It’s an exciting thing to watch: Students are *thrilled* that they’re allowed to take six extra quizzes. From my viewpoint as the instructor, I am thrilled to give out high-fives as they finally master those tricky problems we’ve seen all semester. Mastery-based grading means students can’t get by relying on partial credit, and so they really have to re-visit the tricky topics several times -- but it’s a really great moment when students realize everything has finally clicked.

**+1: What question should I have asked you in this interview?**

What are some projects you’re involved in outside of the classroom?

- I’m very involved with our “Science and Mathematics for Teachers (SMFT)” Master of Education (M.Ed.) program. This is an interdisciplinary program designed for in-service middle school and high school teachers. At the end of this semester, two of our students will present their Capstone Projects and officially complete their degrees. I’m excited to see how their projects turn out and how what they’ve learned will impact their classrooms and students.

- Although most of my time is spent on teaching-related tasks, one of the best parts of my job is when I get to be a learner instead of an instructor. Graduate student Colin Alstad is defending his masters thesis (“Categorifications of Dihedral Groups”) later this month. Serving on Colin’s thesis committee has given me a great excuse to keep learning more math -- in this case, some category theory.
- Since 2015 I’ve been the co-Director for the College of Charleston’s “Math Meet,” an annual event held each February. The Math Meet attracts hundreds of students from the region -- this year was the 41st annual Math Meet and we hosted over 450 middle school and high school students from South Carolina, North Carolina, and Georgia. In one day, we offer more than a dozen different events, including three levels of a Written Test, a Math Team Elimination, a Math Team Relay, several Math Timed Sprints, a Physics Brainstorm, a Chemistry Brainstorm, and a trophy presentation in the afternoon. While it seems like the 2019 Math Meet just wrapped up, we have already begun planning for Math Meet 2020.
- Lastly, I’m a parent of three fantastic kids (ages 8, 5, and 3), so I spend a lot of time juggling work-related tasks with gymnastics practice, soccer games, swim lessons, playing outside, laundry, etc. I’m excited for the summer months since it means I’ll have more time to spend with my family. In particular, it’ll mean more time to share some mathematics with my 8-year-old son -- he has decided he wants to become a mathematician when he grows up!

The longer I use specifications grading, and the more I see how differently students experience college courses that use mastery grading compared to courses that don't, the more I believe that the reform of our grading practices is an urgent ethical imperative. Like I said on Twitter last week:

Not just less important - it's clearer every year to me that grades are increasingly corroding education and student well being. The alarm bells are getting louder.

— Robert Talbert (@RobertTalbert) January 18, 2019

I switched from traditional, points-based, no-revision grading a few years ago to specifications grading because I had a strong sense that traditional grading was not only uninformative (large numbers of false positives and false negatives, and no clear link between the grade and what the students can do) but also actively harmful to many students in many ways, one of the biggest being *motivation*. When I used traditional grading, students always seemed motivated not by the promise of learning the subject but by the inner game of scoring enough points in the right ways to get the grade they needed to move on --- or else they had no motivation at all.

This intuition that traditional grading is demotivating was just that: an intuition. But a study I came across recently gives evidence about the real effects of traditional grading on motivation.

Chamberlin, K., Yasué, M., & Chiang, I. C. A. (2018). The impact of grades on student motivation. Active Learning in Higher Education, 1469787418819728.

Link to paper: https://journals.sagepub.com/doi/pdf/10.1177/1469787418819728

The authors in this study investigate how "multi-interval" grades (read: the A/B/C/D/F system) affect students' basic psychological needs and academic motivation when compared with "narrative evaluation", where the instructor gives students verbal feedback both instead of, and in addition to, multi-interval grades.

The theoretical basis of the study is self-determination theory (SDT). This framework is where we get the concepts of *extrinsic* and *intrinsic* motivation, where people are motivated to complete a task either by an external reward or for the sake of the task itself, respectively. (For more background, I wrote about SDT and flipped learning in this post.) According to SDT, there are three basic psychological needs that learners have while they are involved in a learning process: **competence** (the need to be good, or at least feel that they are good, at what they are learning), **autonomy** (having choice and agency), and **connectedness** or "relatedness" (being psychologically connected to others while doing the task). Essentially, the more these three needs are met in a learning process, the more intrinsic motivation the learner will experience; the lack of satisfaction of these needs leads to less intrinsic motivation, either in the form of extrinsic motivation or no motivation at all.

The authors studied 394 students at three different universities. One of those universities gave exclusively multi-interval grades in its classes; another had institutionally eschewed multi-interval grades and used only **narrative evaluations** in its courses, where, instead of a grade, students get honest, detailed, constructive, and actionable verbal feedback on what they did and what they need to do. The third used a mix of narrative evaluation and letter grades. The students were given two surveys on academic motivation, and a subset of those underwent semi-structured interviews to dig deeper.

The results are a sobering indictment of traditional grading. Here are just a few that stood out.

Students were asked, among other things, about what information (if any) they got from their grades, whether their grades affected their decisions on what classes to take, and whether their relationship with grades had changed since high school. The prevailing opinion was that grades do *not* convey "competence-enhancing feedback" that can be used to improve; most students could not give any examples of how they used grades to improve their learning. Worse, the information that grades *did* give students tended to be negative signals about the students' self-worth. High-achieving students experienced pressure to achieve high grades; low-achieving students felt condemned by their low grades. All students associated the word "stress" with grades far more frequently than any other concept.

Moreover, traditional grades actively eroded students' sense of autonomy, because the grade they got and what they had learned often seemed unrelated. As one student said:

And it was actually pretty frustrating because it felt like even in classes where I was really into the content and worked really hard I came out with a B+. And in classes that I didn’t care about and didn’t work very hard I still got a B+.

Grades worked against relatedness as well, as expressed by some students who described how their relationships with their parents suffered when their grades were poor.

The authors also noted that when discussing traditional grades, students readily adopted capitalist-style business language, for example referring to "cost-benefit analysis" and "payoffs" in describing how they approach class. That's strategic learning and extrinsic motivation taking hold.

The results from students who experienced narrative evaluation were almost completely the opposite of the results from multi-interval grading. Every "narrative evaluation" student interviewed said that narrative evaluations gave them usable information about their competence and were more useful than multi-interval grades. The study found strong links between narrative evaluation and enhanced competence, autonomy, and connectedness, and many of the students commented on how narrative evaluation built *trust* between the student and the instructor --- even if the feedback was largely negative.

These results came not just from the interviews but also from the quantitative results of the surveys, with statistically significant differences in measures of academic motivation found between students from traditional grading backgrounds versus narrative backgrounds (with narrative grading leading to higher indicators of motivation). Students from the university that used mixed grading methods experienced some of the benefits of narrative evaluation, but also some of the detractions of traditional grading --- and although the study didn't say this directly, it seems clear to me that the detractions happen because of the letter grades. (If you put a student in a "mixed" environment and give them good narrative evaluations followed by a "B+" grade, guess what the student will tend to focus on?)

So what do we do about this? For me, the course of action is clear: **We need to walk away from traditional grading** --- in which I include not only multi-interval letter grades but also grades based on statistical point accumulation. We've seen enough. Grades are harmful to students' well-being; they do not provide accurate information for employers, academic programs, or even students themselves; and they steer student motivations precisely where we in higher education do *not* want those motivations to go. There is no coherent argument you can make any more that traditional grading is the best approach, in terms of what's best for *students*, to evaluating student work. If we value our students, we'll start being creative and courageous in replacing traditional grading with something better.

Cue the objections about how this can't be done because of transfer credit issues, making non-traditional grading work at scale, etc. I agree partially, in the sense that this move is a long sequence of small steps. The article here is similarly pragmatic and gives some good advice:

Few universities are likely to abolish grades. However, universities should question the conventional use of multi-interval grades and consider their advantages and disadvantages in different departments, years of study, courses and learner types. For example, there may be specific courses or programs [...] in which cultivating deep learning and motivation may be more important than standardized communication of performance to external audiences. For such courses, greater use of narrative evaluations (as opposed to multi-interval grades) may be warranted. In addition, withholding grades from students or providing narrative/ written feedback several days prior to the grades may help students focus on mastery-related learning goals rather than extrinsic rewards.

I'd add the following ideas that I've learned from using specifications grading and hearing about how others use this and other forms of mastery grading:

- It's possible to keep the A/B/C/D/F system for reporting *semester* grades, but use narrative evaluation and mastery grading instead of points and statistics to determine students' grades. Here's an example.^{[1]}
- Do what the article suggests and start changing your grading practices over to something less focused on letters and points in those courses where narrative evaluation and mastery grading make the most sense: graduate courses, seminars, proof-oriented upper-level math courses, honors sections of courses, and so on.
- I think you could also make a strong case that introductory courses are fertile ground for trying out narrative evaluation and/or mastery grading, because these are where student motivation tends to be at its lowest point.
- Treat student work --- as much of it as possible --- like submissions to a journal. When we academics submit articles to a journal (or tenure portfolios, etc.), we don't get a point value or letter grade attached. We get verbal feedback with a brief summary --- "Accept", "Reject", "Major revision", "Minor revision" --- followed by details. Then assign course grades, if you must have them, based on how much acceptable work the student was able to produce.

There are some practical issues at work here that can't be minimized, for example (and especially) large sections. The issue of scaling is a tough one, but it's not impossible. In my experience with specs grading, doing narrative evaluation takes no more time per student than traditional grading (which involves endless hair-splitting on how many points to give a response), so I don't think there's any reason to believe that nontraditional grading can't scale up.

Moving away from traditional grading could be one of those Pareto principle concepts where focusing intently on this one idea could usher in outsized improvements in many other areas of student learning. I think it could be a fulcrum for bringing about wholesale, even revolutionary change in higher education. Let's give it a try.

Although: I have to admit that recently, I've noticed that students in my specs-graded classes tend to focus laser-like on their grading checklist where they keep track of how many Learning Targets they've passed, rather than on what those Targets represent. In other words the specs end up becoming a proxy for letter grades and students fixate on those accordingly. I'm still thinking about how to handle this. ↩︎

Two and a half years ago, I decided that the traditional system of grading student work --- based on assigning point values to that work and then determining course grades based on the point values --- was working against my goals as a teacher, and I decided to replace it with specifications grading. I had just learned about specs grading through Linda Nilson's book on the subject. This happened right at the end of Fall semester 2014, and I spent the entire Christmas break doing a crash-course redesign of my Winter 2015 classes to install specs grading in them.

I've used specs grading sixteen times since then: once in Cryptography and Privacy, once in Abstract Algebra 2, twice in Calculus 1, four times in Discrete Structures 1, and eight times in Discrete Structures 2. It's fair to say that my implementation has been battle-tested and has undergone a fair bit of evolution in that time. The first attempt in Winter 2015 was pretty rough, but very promising. Every semester since then, I've made changes and updates to try to address issues that students and I noticed.

But it was only this last semester, the one that just concluded this week, where I felt that at every point during the semester --- from day 1 all the way through turning in course grades yesterday --- the specs grading system I had in place was working the way I wanted. It's still not 100% there, of course, but I think I have a blueprint of how to use specs grading moving forward^{[1]} and of course, I want to share it with everyone.

In specifications grading, instead of using points to assess student work, the work is graded on a two-level rubric --- that is, some variation on Pass/Fail or Satisfactory/Unsatisfactory. Instructors craft a set of specifications or "specs" for assignments that define what Satisfactory work looks like. When the work is handed in, the instructor simply categorizes it as Satisfactory or Unsatisfactory depending on whether it meets the specs or doesn't. There are no points, so there is no partial credit. Instead, instructors give detailed feedback on student work, and specs grading includes giving students the opportunity to revise their work based on the feedback, and submit a revision as an attempt to meet specs.

Specs grading still uses an A/B/C/D/F course grade reporting approach, but the letter grades are earned differently. Rather than calculating complex weighted averages of points --- which you can't do because there are no points --- letter grades are earned by completing "bundles" of work which increase in size and scope as the letter grade being targeted goes higher. The idea is that students who want a "C" in the course have to do a certain amount of work that meets the specs; those wanting a "B" have to do everything the "C" people do, but more of it and of higher quality and/or difficulty level. Similarly the "A" students do everything the "B" students do plus even greater quantity and quality.

Done right, specs grading allows students choice and agency in how and when they are assessed; students are graded on what they can *eventually* show that they know, and they get to learn from mistakes and build upon failures; their grades are based on actual concrete evidence of learning; and the grades themselves convey actual meaning because they can be traced back to concrete evidence tied to detailed specifications of quality. The instructor often saves time too, because instead of determining how to allocate points (which takes more time than you think), she just determines whether the work is good enough or not, and gives feedback instead.

The specs grading setup I am going to describe here is for Discrete Structures 2, a junior-level mathematics course taken almost exclusively by Computer Science majors. It's the second semester of a year-long sequence and it focuses on mathematical proof and the theory of graphs, relations, and trees. I think that much of the structure I am going to describe here could be ported to other math classes, though.

My overall belief about the course grade in this class (and in others) is that grades should be based on *concrete evidence of student success* in three different areas:

- Mastery of **basic technical skills**;
- Ability to **apply basic technical skills and concepts to new problems**, both applied and theoretical; and
- **Engagement** in the course.

Some people may debate whether "engagement" ought to be part of the grade. Personal experience with this course tells me that, in this case, it should. What I mean here is not just attendance in class, but also preparation for class, active participation during class, and engagement with the course outside of class. I want students to treat the course as a high priority and engage with it as such.

If these are the things I want from the course, then I need to set up stuff for students to do and submit to me that will allow me to measure whether or not they are progressing or succeeding in those areas.

For basic technical skills, I combed through the course and decided on a list of 20 basic skills that I felt were essential building-block skills for the course. Those are called **Learning Targets**. These were keyed to the four major topics in the course (proof, graphs, relations, trees). Here are a couple:

P.2: I can identify the predicate being used in a proof by mathematical induction and use it to set up a framework of assumptions and conclusions for an induction proof.

G.6: I can give a valid vertex coloring for a graph and determine a graph's chromatic number.

Here's the full list. Notice these are phrased in terms of concrete action verbs that produce assessable results.

For ability to apply these basic skills, students were given a series of **Challenge Problems**. These problems require students to apply what they learned about the basic skills, and they included a mix of "Theory" problems that involved writing proofs, programming assignments where students had to write Python code to solve a problem, and real-world applications. I started with a core of ten Challenge Problems but wrote more during the semester as I got inspired, and we ended up with 17 in total. Here's one that involved doing some proofs by induction. Another had students write a Python function that would compute the composition of two relations on a finite set. Another had students use Python code to experiment with a class of graphs, make a conjecture about their clustering coefficients, and then prove their conjecture.
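As a hedged illustration of the relation-composition task mentioned above, here is one way such a Challenge Problem might be solved in Python. The function name, the representation of relations as sets of ordered pairs, and the composition convention are my assumptions, not the actual assignment:

```python
def compose(R, S):
    """Return the composition of relations R and S on a finite set,
    where each relation is a set of ordered pairs (a, b).

    A pair (a, c) belongs to the composition exactly when there is
    some b with (a, b) in R and (b, c) in S.
    """
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

# Example: R relates 1 -> 2 and 2 -> 3; S relates 2 -> 3 and 3 -> 1.
# The composition relates 1 -> 3 (via 2) and 2 -> 1 (via 3).
R = {(1, 2), (2, 3)}
S = {(2, 3), (3, 1)}
print(sorted(compose(R, S)))  # [(1, 3), (2, 1)]
```

A comprehension like this runs in time proportional to |R| × |S|, which is plenty fast for the small finite sets a homework problem would involve.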

Finally, for engagement, I broke from the specs grading mold and used points, or what I called *engagement credits*. Students accumulated engagement credits through the course for doing things like completing their Guided Practice pre-class work on time and participating in class on certain days (especially the days close to breaks). Basically this was a way of incentivizing student work on the course for things that I needed them to do, especially outside of class.

I also had students take a final exam in the course, which I'll describe in the next section.

Each of the three areas above was assessed quite differently.

The Learning Targets were assessed using short quizzes called *Learning Target assessments*. I set aside every other Friday in the course for students to take Learning Target assessments as well as a few extra days during the semester. Here's an example of the assessment for Learning Target P.2 and here's the one for G.6. Notice they are simple tasks that deal directly with the action verb in the Learning Target.

Students could come on these Fridays and take as many or as few Learning Target assessments as they wanted. Only the Learning Targets we'd discussed in class were available, but once a Learning Target became available, it was *always* available, with a new version of the same problem offered at each subsequent session. So a student who didn't feel ready to be assessed on Learning Target G.6 didn't have to take the assessment for G.6; they could just wait two weeks and try it then.

Learning Target assessments were graded **Satisfactory/Unsatisfactory** according to specifications that I determined, and those specs are at the bottom of each Learning Target assessment so it's very transparent for everyone. If student work was Satisfactory, I just circled the "S" at the top of the page, and circled "U" otherwise.

Challenge Problems were *not* graded Satisfactory/Unsatisfactory but rather using the EMRN rubric, which is a modification of the EMRF rubric I wrote about here.^{[2]} "Pure" specs grading would say that I should not make things so complicated, and just use Satisfactory/Unsatisfactory with a high bar set for Satisfactory. But I've found that in math classes, written work is hard to get right and students get easily discouraged, so I felt there should be some detail added to the 2-level rubric to distinguish between Satisfactory work that is excellent versus merely "good enough", and Unsatisfactory work that is "getting there" versus work with major shortcomings. Students would submit their Challenge Problems as PDFs or Jupyter notebooks on Blackboard; I'd grade them there and leave feedback, and then students could revise (see below).

Importantly, there were no recurring deadlines on Challenge Problems. Instead, students were allowed up to two Challenge Problem submissions per week (Monday--Sunday) which could be two new submissions, a new submission and a revision, or two revisions. The only fixed deadline for Challenge Problems was 11:59pm on the last day of classes, after which no submissions of any kind were accepted. This helps keep students from procrastinating until the end of the semester and dumping a ton of Challenge Problems into the system all at once. (Although there were issues with this; keep reading.)

Engagement credits were given for a variety of tasks, so whenever there was a task given out that could earn engagement credit, I'd just explain what it takes to earn the engagement credit and go from there.

At the end of the course students took a final exam. The final exam consisted of eight randomly selected Learning Target assessments that had been given previously in the course, along with a final question asking for feedback on the course. The Learning Targets were selected so that at least one from each of the four main course topics was represented. I'd never given a final exam in a specs grading class before this semester; in past classes, the final exam period was set aside as one more session for any student who needed to pass Learning Targets to have a chance to do so. I instituted the final exam this time because I wasn't satisfied that the combination of Learning Target assessments plus Challenge Problems was giving me reliable data about student learning. I was getting students who would pass a Learning Target early in the course, then forget that they had done so, and "accidentally" retake that Learning Target later... and not pass it. So I wanted one additional layer of assessment to get students to recertify on the basic skills at the end of the course. Since all I was doing was recycling old Learning Target assessments, the exam was easy to make up: just randomly select the Learning Targets and assessment versions, then merge the PDFs. (I made four different versions for test security.)

I broke again from the specs grading mold and graded the final using points, scoring each recycled Learning Target with 0, 4, 8, or 12 points. A 12-point score was given if the work would have earned a Satisfactory mark according to the original specs, 8 if it was "almost Satisfactory", and so on. The feedback question was given 4 points, bringing the total to an even 100 points.
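The scoring arithmetic just described can be sketched as a small function. This is my own illustration under the point values stated above; the function and its names are hypothetical, not course code:

```python
def final_exam_score(target_points, feedback_points=4):
    """Total a final exam of eight recycled Learning Targets, each
    scored 0, 4, 8, or 12 (12 = would have met the original specs),
    plus a 4-point feedback question: a maximum of 8 * 12 + 4 = 100.
    """
    assert len(target_points) == 8
    assert all(p in (0, 4, 8, 12) for p in target_points)
    return sum(target_points) + feedback_points

print(final_exam_score([12] * 8))                     # a perfect exam: 100
print(final_exam_score([12, 12, 12, 8, 8, 8, 4, 0]))  # 68
```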

As in all specs grading courses, almost all student work of consequence could be revised in some way.

Learning Target assessments could be "revised" by retaking them, either at a subsequent Learning Target assessment session on Friday, or by scheduling a 15-minute appointment in the office to do it orally. There was no limit on the number of times students could retake Learning Target assessments, but there *was* a limit on office hours appointments: no more than twice a week, for 15 minutes each, and no more than two Learning Targets attempted per 15-minute session, and appointments had to be scheduled 24 hours in advance. Also, students had to try the Learning Target on paper first before doing it in the office. This was a policy purely to keep the number of office hours visits for Learning Target revisions down to a reasonable level.

For Challenge Problems, students could revise any Challenge Problem that received an "R" or "N" grade just by submitting a new version on Blackboard that addressed the feedback I gave. There were no limitations on the number of times students could revise Challenge Problems other than the two-submissions-per-week rule, and the fact that any Challenge Problem work that earned "N" required students to spend a token to revise.

What's a "token"? A token in specs grading is a sort of "get out of jail free" card that a student can spend to bend the rules of the course a little. Every student in my course started with five tokens. By spending a token, a student could purchase a third submission of a Challenge Problem in a given week (but these couldn't be "stacked", for example to get four submissions in a week for two tokens), to purchase a third 15-minute oral revision session in a week, or to purchase five engagement credits.

There were no revisions available for Guided Practice, the final exam, or any item that earned engagement credits.

The "specs" part of this system so far has come from the Satisfactory/Unsatisfactory rubric and EMRN rubric used for grading Learning Target assessments and Challenge Problems. Most of the engagement credit-earning items were also graded Satisfactory/Unsatisfactory. Specs grading also has to do with the assigning of course grades, and here is how it worked in my course.

First of all, let's distinguish between the **base grade** for the course and the **modified grade**. The base grade is just the A, B, C, D, or F that a student earns. The modified grade is the base grade adjusted up or down by a plus or minus. Course grades were determined by a simple two-step process.

The *base grade* in the course was determined using this table:

| To earn this grade: | Accomplish the following: |
|---|---|
| A | Earn Satisfactory marks on 19 Learning Targets; and complete 10 Challenge Problems with at least an M mark, including at least five "E" marks. |
| B | Earn Satisfactory marks on 17 Learning Targets; and complete 7 Challenge Problems with at least an M mark, including at least three "E" marks. |
| C | Earn Satisfactory marks on 15 Learning Targets; and complete 5 Challenge Problems with at least an M mark. (No "E" marks required.) |
| D | Earn Satisfactory marks on 13 Learning Targets. (No Challenge Problems required.) |

So the base grade in the course is entirely determined by three points of information: (1) how many Learning Targets you pass, (2) how many Challenge Problems you pass, and (3) how many Challenge Problems show excellent work. (An "F" grade is awarded if a student doesn't complete the requirements for a "D".)

The grade of "C" is considered "baseline competency", and to earn that grade you have to complete the "C bundle", which is passing 75% of the Learning Targets and completing five Challenge Problems, with no requirement of excellent/exemplary work required. The "B bundle" is everything in the "C bundle" with more Learning Targets passed and more Challenge Problems completed plus some evidence of excellent/exemplary work. The "A bundle" is likewise everything in the "B bundle" with even more Learning Targets and Challenge Problems completed plus even more extensive evidence of excellent/exemplary work. Notice, students get to choose which Challenge Problems they attempt -- we had 17 Challenge Problems in all and students just picked the ones they liked^{[3]}.

Additionally, students targeting an A or B grade in the course had to complete at least one "theory"-oriented Challenge Problem with an E or M grade (i.e., successfully write a proof of a mathematical conjecture), or else the final grade was lowered by one-half letter.
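Since the table is essentially a lookup, the base-grade determination can be sketched as a short decision procedure. This is my own illustration, not course code, and it omits the theory-problem requirement and the plus/minus modifiers, which apply separately:

```python
def base_grade(targets_passed, challenges_m_or_better, challenges_e):
    """Look up the base grade from the bundle requirements:
    targets_passed: Learning Targets with Satisfactory marks (out of 20);
    challenges_m_or_better: Challenge Problems earning at least an M;
    challenges_e: Challenge Problems earning an E specifically.
    """
    if targets_passed >= 19 and challenges_m_or_better >= 10 and challenges_e >= 5:
        return "A"
    if targets_passed >= 17 and challenges_m_or_better >= 7 and challenges_e >= 3:
        return "B"
    if targets_passed >= 15 and challenges_m_or_better >= 5:
        return "C"
    if targets_passed >= 13:
        return "D"
    return "F"

# 18 Targets and 8 good Challenge Problems (4 excellent) misses the
# A bundle but clears the B bundle.
print(base_grade(18, 8, 4))  # B
```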

If the only grade we awarded were these five letters, this would be extraordinarily simple. But we also award plus/minus grades, and so I had to add rules into the system for how this works. I chose to approach this by awarding plus/minus modifications on the basis of the final exam and on engagement. The base grade was raised by a plus, lowered by a minus, or lowered by an entire letter as follows:

- Add a **plus** to the base grade if you earn at least **60** engagement credits *and* earn **at least an 85%** on the final exam.
- Add a **minus** to the base grade if you earn between **30 and 39** engagement credits (inclusively) *or* earn **between 50% and 69%** (inclusively) on the final exam.
- Lower the base grade **one full letter** if you earn **fewer than 30** engagement credits *or* earn **lower than 50%** on the final exam.
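These three rules amount to one ordered check. Here's an illustrative sketch of my own (the function name is mine, and I'm assuming the most severe applicable rule wins when the two measures point in different directions):

```python
def plus_minus_modifier(engagement: int, final_pct: float) -> str:
    """Return the modifier applied to the base letter grade.

    Checks are ordered from most to least severe, so a student who
    triggers both a drop condition and a minus condition gets the drop.
    """
    if engagement < 30 or final_pct < 50:
        return "drop one full letter"
    if 30 <= engagement <= 39 or 50 <= final_pct <= 69:
        return "minus"
    if engagement >= 60 and final_pct >= 85:
        return "plus"
    return "no change"  # the "safe zone": at least 40 credits and 70% on the final
```

The fall-through case is the "safe zone": at least 40 engagement credits and at least 70% on the final leaves the base grade untouched (unless the student clears the higher bar for a plus).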

So, students' *base grades* are determined by the really important stuff in the class --- basic skills and the ability to apply them. Those base grades are then modified with a plus or minus based on performance on the final exam and on engagement in the course. There was some insulation in place so that doing poorly on the final or having low engagement wouldn't be catastrophic, but it would still affect the grade. Note that there's a "safe zone" cutoff here of 70% on the final and 40 engagement credits: students who do at least this well are immune from having their base grade penalized.

In my view, the whole plus/minus system here fouls up what is otherwise a beautifully simple grading system, but according to the Dean's office I *have* to have some plus/minus system in place. This is the best I could come up with.

To help students keep up with this, they were given two visual aids. First there was this scorecard that they could use to track their progress on their base grade --- just check off or fill in the boxes as the semester progressed. Then, to navigate the plus/minus system, there was this flowchart that got students from their base grade to the final grade in just a few questions.

I haven't gotten back evaluations for this course yet, so I just have verbal feedback and mid-semester surveys to go on. But based on what I have, students were totally thrilled by this system. They remarked about how it took a little while to get used to it, but once they "got it" they wished that all their other courses did the same thing. Computer science majors in particular --- who make a career out of determining how to debug their code from feedback given by the compiler --- really resonate with the idea of being able to debug their math work by using detailed feedback. I've been contacted by at least one other professor in the CS department here who's had students from my course talk about how much they appreciated it and how much it helped them learn.

For my part, I could definitely see students learn in ways that traditional grading, with its one-and-done approach to assessing skills, simply doesn't support. Students' ability with proof especially benefitted. Most of my students had *seen* proof in their first-semester class because that's a required topic, but their proof skills were virtually non-existent in my class. Their first attempts were usually pretty sad. But with feedback and office hours visits, they kept at it, and eventually almost everyone was able to whip at least one induction proof into shape, and could demonstrate skills with proof through Learning Targets P.1--P.4. A few became really fascinated with induction proofs and did several of these of their own free will.

I really liked the *simplicity* of this system as well, with the base grade determined by a simple lookup on a four-line table and the modifications done with another similar table. My past attempts at specs grading were detailed but tended to be like Rube Goldberg machines, sort of bloated and complicated. I was really going for simplicity and minimalism this time, and while again the plus/minus system messes this up somewhat, I was still very happy with the results.

I also liked that this system focused student conversations away from points and the grubbing of points and toward content and knowledge. I never heard anything like, "*I need to earn an 86.2 on the last test in order to bring my average up to a B*". Instead I heard conversations like, "*I need to complete one more Challenge Problem to get a B, should I focus on applications of graph theory or writing code to implement a relation?*" or "*I've taken Learning Target P.3 three times without success and it's because I don't get structural induction --- can we talk about that?*"

Likewise, this system inverts the way students tend to approach the course as a whole, that is by coming to class without a clear idea of what they want out of it and just hoping for the best. Here, students have to think about the grade they want to earn *first*, then this tells them which "bundle" to look at in the syllabus and this lays out an agenda for what they need to accomplish in the course. There is no "hoping for a grade"; the student is in control and we talk about *targeting the grade you wish to earn* instead of hoping for the grade you think you "deserve".

This was the first time I'd tried oral revisions of Learning Targets in the office and I thought that was a great success, especially on proof-related targets where students could just *talk* through the problems rather than writing so much down. I think in some cases I got much better information about student learning by talking face-to-face with them rather than reading their writing.

I also think the final exam was good for providing that extra layer of assessment: it allowed me to triangulate my data about student performance from the Learning Targets and the Challenge Problems, and it meant students couldn't just clock out of the course once they'd reached the requirements of their chosen grade bundle.

I continue to believe that doing away with recurring deadlines for Challenge Problems is a good idea and student work is better without them. At the same time, students really struggled with procrastination. Although I had measures in place to help with this^{[4]}, by the end of week 9 in a 14-week semester, the median number of *submissions* of Challenge Problems --- not Challenge Problems passed, but *submitted* --- was *one*. Therefore the vast majority of students were still cramming in Challenge Problems during the last three weeks of class; I received 75 different Challenge Problems to grade over the weekend of week 12 for instance. I eventually dug out from under the grading, but procrastination cost some students a passing grade. So while I'll continue this quota/single deadline system in the future, we all need to take procrastination more seriously.

The final exam helped eliminate false positives in the course, but next time I'm going to make the final contain not only old Learning Targets but also some conceptual questions, to get data on students' conceptual understanding and not just basic technical skill. For example, I had some students who could perform Warshall's Algorithm, but I am not sure they knew what this algorithm does or why it works.
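For context, Warshall's Algorithm computes the transitive closure of a relation: given which pairs are directly related, it determines which pairs are connected by some chain of relations. A minimal sketch in Python (my own illustration, not course code) captures both the mechanics students could execute and the "what it does" I want them to be able to articulate:

```python
def warshall(adj):
    """Transitive closure of a relation given as an n-by-n boolean matrix.

    After processing intermediate vertex k, reach[i][j] is True exactly
    when j is reachable from i using only intermediates from {0, ..., k}.
    """
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the input isn't mutated
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach
```

Being able to run the triple loop by hand is the technical skill; knowing that the output answers "is there *any* path from i to j?" is the conceptual understanding the final should probe.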

Finally, while I'm convinced this specs grading system isn't more complex than traditional grading systems, it's quite different, and it requires that students really take a consistently hands-on approach to the class by learning the system, reading the syllabus carefully, and paying attention to announcements and calendar events regarding graded work. Unfortunately this is by far my students' weakest link --- managing information streams, projects, calendar events, and tasks. I feel like there should be a course-within-a-course here that includes some basic GTD training and accountability for staying current with course info, because failure on this front absolutely destroyed some students.

To conclude here: I am a convert to specs grading and I do not see myself going back to traditional grading anytime soon. It's just too good. I still hate grading like everybody else, but at least now when I grade, I am giving feedback to students rather than splitting hairs over points, and it's really changed the dynamic of my classes for the better.

For those who are interested, here is the syllabus for the course with all the details (yes, I actually left stuff out in this post).

Now, ask me some questions about this.

My next class isn't until Fall 2018 because of my sabbatical, so I need this post for myself, too, to help me remember how to do this 15 months from now. ↩︎

I changed the "F" to "N" because the letter "F" in my opinion is so emotionally loaded for students that it simply cannot be used in any context for grading any more. I could make the top grade "F" for "Fabulous" rather than "E" for "Exemplary" and students would still think they failed. The letter "F" is done in education. ↩︎

Which in practice often turned out to be the ones they felt would be easiest. Some definitely looked for the easy way out. But most students found they gravitated toward certain kinds of problems and developed a real interest in those topics because *they picked them*. ↩︎

For example I set up "incentive checkpoints" that awarded engagement credit in big chunks for finishing a certain number of Challenge Problems and Learning Targets by certain points in the course, and the only way to earn a plus grade in the course was to hit at least one of those checkpoints. ↩︎

In the past few days, two national op-ed pieces about grades and grading in higher education have appeared. Corinne Ruff wrote this piece for the Chronicle (paywall, sorry), and then Mark Oppenheimer wrote this Washington Post op-ed provocatively titled "There's nothing wrong with grade inflation". The fact that these appeared within a few days of each other possibly signals that there is a growing sense that something is wrong with grades in higher education, and it definitely affords an opportunity to raise awareness about alternatives like **standards based and specifications grading** (SBSG).

The WaPo piece is about grade inflation, specifically about the pointlessness of trying to combat grade inflation any longer. Oppenheimer points out that all the major efforts to combat grade inflation in the elite schools have ended up causing more problems than they solve. And so, as Oppenheimer says, "Our goal should be ending the centrality of grades altogether. For years, I feared that a world of only A’s would mean the end of meaningful grades; today, I’m certain of it. But what’s so bad about that?" He goes on to point out many of the failings of traditional grades that I've mentioned here: grades promote extrinsic motivation and surface or strategic learning at best, they don't always measure learning accurately, and they don't measure certain important kinds of cognition at all.

Oppenheimer says "We need to move to a post-grading world. Maybe that means a world where there are no grades — or one where, if they remain, we rely more on better kinds of evaluation." He then proposes a system of "nuanced transcripts with comments" and gives several examples of schools taking this path. This proposed system of "transcripts with comments" will remind some readers of this article I posted in September where I proposed basically the exact same thing.

He points out that this "nuanced transcript" approach is being used at elite institutions and small schools, and that this can't necessarily be replicated by larger universities or by contingent faculty who don't have the time or resources for investing hours of time in writing detailed letters for each student's portfolio. His answer to this is that maybe the larger schools can make small steps toward change, for example by abolishing the use of the SAT for admissions and doing *something* about transcripts. To someone who might have been nodding in agreement along with this op-ed up to this point, that conclusion must be disappointing. Isn't there *something* that can be done about grades if you're not tenure-track at a small or elite institution?

Let's cut over to Corinne Ruff's article in the *Chronicle*. The article asks a question (Why do colleges still use grades?) but never seriously attempts to answer it. Instead Ruff, like Oppenheimer, raises the concern that grade inflation is so bad that grades themselves have become meaningless. Ruff also mentions a potential fix for this problem in the form of competency-based education as practiced by institutions such as Western Governors University. But like the Oppenheimer piece, Ruff's article ends on a somewhat negative note. Quoting Woodrow Wilson National Fellowship Foundation president Arthur Levine, the article casts meaningful reform of grading in higher ed as something far off in the future:

"This isn’t all going to happen next week," [Levine] says, adding that most institutions still haven’t taken steps to move away from grades. "We’re talking about an evolution over time."

If the situation is so bad, then isn't there *something* that can be done about grades in higher ed that doesn't involve a wholesale revolution in higher education itself that would take decades, and quite frankly isn't likely to happen at all if it's framed as something that requires a revolution?

If you read this blog with any regularity, you know that my answer to this question is "yes", and that the answer is SBSG. I think SBSG addresses the core concern of both of these articles -- that grades have become or are becoming meaningless -- and implements the actions implied by both of these articles (we need to replace traditional grading with something else) in a way that gives individual instructors and students control over the process, so that the change is closer to the ground and requires only some careful planning and marketing, rather than wholesale revolutionary change.

If grades have become meaningless -- and I think that they are getting to that point, if not already -- then it's because grades have become decoupled from demonstrable student learning. What does a "B" in Calculus actually mean about what a student knows or doesn't know about Calculus? It *might* mean that the student knows considerably more than someone who has a "D" in the course. But beyond that, it's impossible to say. Even if we knew the assessments used in the course and the sorts of work that students were asked to do, it's impossible to say. Without having grades tied to concrete accomplishments of specific learning goals done to clear specifications of professional quality, we simply don't know what a grade means.

What about grade inflation? Oppenheimer suggests that the inflation of grades has caused, or is causing, grades to become meaningless. But it might well be the other way around -- that the meaninglessness of grades, by which I mean the inability to deduce information about learning from the grade itself, could be driving grade inflation. If professors and future employers don't believe that grades have meaning, why *shouldn't* we give students high grades for poor quality work, and let the "real" grade become -- as Oppenheimer suggests -- letters of recommendation and the like? On the other hand, if grades really *did* have meaning, then perhaps we'd be less likely to inflate them and give high grades for poor quality work, both out of a sense of professional ethics and also because the system that delineates what grades mean wouldn't allow it.

This is where SBSG comes in. In SBSG, we have

- Specific learning targets that undergird the whole course and spell out exactly what students need, eventually, to show proficiency in.
- Assessments that ask students to demonstrate specific evidence that those learning targets were met.
- High standards of professional quality for what constitutes acceptable evidence on each assessment.
- Opportunities for revision and learning from mistakes, so that the assessments of learning are less prone to false positives or false negatives.
- A course grading system that is tied specifically to the quantity and quality of evidence that students provide of their learning, relative to our targets and standards.

In short, in SBSG, grades *mean* something. When a student earns a B in my discrete structures course, I know what it means: the student demonstrated proficiency on 20 out of 20 learning targets that address core competencies; demonstrated additional evidence on five of those 20 targets; completed six short projects throughout the semester that met standards of quality for such work; and maintained an 80% completion rate of all course preparation and homework tasks. If needed, I can produce the quality standards and the learning targets themselves. And all of this is spelled out in the course syllabus -- it's not occult knowledge or a subjective opinion. Even if I *wanted* to give high grades for poor or insufficient work, the system itself works against that.

Last semester when I was teaching this discrete structures class, it turned out that around half of my class earned grades of A or A-. I was worried, to be honest. I felt that perhaps I had made the course too easy. But then I went back and looked at each student's track record in the course, and every student who earned those grades did so because of a concrete, specific body of work that they had worked hard to produce over the semester. I could point to specific work that showed that the students had given acceptable evidence -- acceptable on my terms -- that they had satisfied the learning objectives of the course at "A" or "A-" level. If the specifications for acceptable work themselves aren't too lax -- and I felt like they weren't in this case -- then this is not an instance of grade inflation. It's an instance of large-scale student success, something to be celebrated and not stigmatized.

And to reiterate, SBSG is not something that requires a massive systemic change to get started, as would be the case if a university wanted, say, to transition to the "nuanced transcript" system. We don't have to *wait* for our system of higher education to "evolve". SBSG is something that individual instructors and students can begin to use as early as next semester. We keep the usual way of *reporting* grades using the ABCDF system (although I would love to get rid of that, too, someday) -- just set up a backend for assessment that makes these letters actually mean something. I like the chances of SBSG being successful in the short term a lot more than those of competency-based or transcript-based "grading" simply because it's simpler, and especially because it's more organic. These kinds of changes are best done from the bottom up, where they enjoy the support of faculty and especially students.

So perhaps the answer to the problems raised in these articles is right under our noses and is a lot simpler and closer than we think. What do you think?

I haven't yet posted a complete rundown of what I call *specs grading iteration 4* -- the version of specifications grading that I am using in my classes this semester, the fourth semester after first rolling that system out last year. That would be more like an e-book than a post. So I am posting in bits and pieces. In this post I wanted to focus on an aspect of my assignments in my discrete structures course that is connected to the grading system: Namely, how I am handling deadlines for significant, untimed student work.

How I am handling deadlines is that I eliminated them.

Students in the class do three major kinds of work: timed assessments on learning targets, which are done in class; course management items that include guided practice assignments and weekly syllabus quizzes; and what we call *miniprojects*, which are like homework assignments targeted at applications of basic content to new problems. The miniprojects are what this no-deadline policy targets.

Miniprojects are significant assignments that are challenging in nature, graded using the EMRF rubric. Students are allowed to revise work that isn't "passing" (E or M grades) as well as to attempt to push "M" work up to "E" level. In fact students should *expect* to have to revise their work on these since an "M" is not always easy to get. I am planning on writing 10-12 of these for the semester and students have to pass 8 of them, including at least 2 "E" grades, to get an "A" in the course. (The full grading system is here in the syllabus starting on page 5.)

I used to have hard deadlines on these. In fact the first two I assigned this semester had hard deadlines. Students could spend a token to get a 24-hour extension on that deadline (and up to three tokens to get up to 72 hours of extension) but those deadlines were fixed. About two weeks into the semester, however, I decided that deadlines were not in harmony with the spirit of specs grading. More on this below. So I replaced the deadline policy with this:

- Students are allowed to submit **up to two miniproject-related submissions per week** (= Monday through Sunday). This can be two first submissions, a first submission and a revision, or two revisions. Their choice.
- **No submissions are accepted past 11:59pm EST on Friday, April 22** (the last day of classes).

I'm calling this the **quota/single deadline** system. Students get the freedom to choose what they submit on a weekly basis; and they cannot put it all off until the end of the semester because they can only submit up to two items a week, and there is a fixed no-exceptions single deadline for the whole semester.

Why did I do away with fixed deadlines and replace them with this?

- I don't think it's true that having to work with fixed deadlines on every assignment promotes the kind of behavior some people think it promotes. I've often heard the line that *having to work with deadlines prepares you for the working world.* After being in the working world for 20+ years, I think the value of deadlines as a means of personal growth is vastly overrated. When I look at my own work, the majority of the tasks that I need to do -- and I have hundreds of them at any given moment -- either have no deadlines at all, or the deadlines are self-imposed or can be re-negotiated if needed. And somehow, I learned to be a responsible adult anyway. I tend to think that this happened not because I was compliant, but because I had *freedom to choose my work* within reasonable guidelines. I changed the deadline structure on these assignments in my class because I want students to experience how cool and empowering it is to be invested in one's own work for a class, just like I learned it, by being given some freedom to study what they want and do it on something resembling their own schedule.
- I also disagree with the notion that *having to work with deadlines teaches responsibility and self-motivation*. If I complete a task against my will because there is a deadline attached, that *might* be considered "being responsible" but it is most certainly not being "self-motivated". It's sort of the *opposite* of being self-motivated, specifically being extrinsically motivated. Self-motivation -- or more importantly, self-regulation -- requires some kind of individual agency and a sense of self-efficacy. Getting the work done needs to be the student's idea.
- I also came to realize that most of the complaints that I was getting from students -- and I get several of them every time the semester starts -- had nothing to do with the class but rather were expressions of frustration and stress that were amplified by the presence of deadlines. Sometimes putting boundaries around tasks creates some productive energy. I do this myself sometimes by self-imposing deadlines on projects that have gotten stuck in neutral. But other times -- quite often, when you're a student working two jobs and commuting an hour to and from campus and carrying 16-18 credits of courses -- deadlines just cause stress that is completely *un*productive.
- Finally, I realized that having fixed deadlines on the assignments goes against the flipped learning design that I employ in the class. According to the "F" in the four pillars of FLIP, a good flipped learning environment is a *flexible* learning environment in a number of senses, including the flexibility to choose what and how and when you learn something if you're a student, within reason and within the instructor's framework for the course. Additionally, the whole point of having specs grading is to give students choices on when and how they are assessed, and fixed deadlines don't work in harmony with that idea.

So far the results have been great. Far from procrastinating, students have been very productive. I've been getting about 30-40 submissions a week from 60 students total. Many of them do the math and realize that they need to maintain forward motion on getting things done so as not to wind up in an untenable position at the end. Also, since no single miniproject is required -- they just have to pick from among the ones that are posted -- the students' investment and energy level on these has really improved. (They still have to pass a sequence of timed assessments on the core learning targets of the course, so there's no worry that by not choosing a particular miniproject that they'll miss out on demonstrating mastery on something.) I also have stopped getting those panicky emails at 11:58pm about SageMath Cloud or Blackboard not working. Everybody's stress level has dropped.

So the freedom they get to choose their work and their work schedule has made them exactly what deadlines did not make them: happy, productive, and interested in the material. Maybe deadlines are necessary on some level but I would caution against giving them too much credit for students' development.

In a previous post, I wrote about the EMRF rubric and how I am using it right now in my classes, which are using specifications grading. Here I want to discuss a few instances of how I've used it so far, and the kinds of effects this rubric has had on the narrative within, and about, those classes.

In one of my classes (Cryptography and Privacy) one of the learning targets is

I can find the value of $a \pmod n$ for any integer $a$ and positive integer $n$ and perform modular arithmetic using addition and multiplication.

The problem they get consists of eight basic modular arithmetic computations involving adding, multiplying, and exponentiating. Unlike most work I give students to do, I don't especially care if they show their work on this problem. All I care about is whether they can compute the answer correctly or not. So the EMRF rubric is simply:

- E = All eight answers are correct.
- M = Either 6 or 7 out of 8 are correct.
- R = All parts are attempted but fewer than 6 answers are correct.
- F = Not all parts are attempted.

The idea being that if students can do modular arithmetic correctly 8 times in a row in a single sitting, that's pretty exemplary and I am relatively certain they have mastered the concept. If they can do it correctly about 3/4 of the time then I consider that "good enough". Otherwise they need to practice some more and try again later.
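Since the mark depends on just two numbers -- how many parts were attempted and how many were correct -- this rubric is effectively a short lookup. A sketch in Python (the function name is mine):

```python
def emrf_mark(attempted: int, correct: int) -> str:
    """EMRF mark for the eight-part modular arithmetic problem."""
    if attempted < 8:
        return "F"  # Fragmentary: not all parts attempted
    if correct == 8:
        return "E"  # Exemplary: all eight correct in one sitting
    if correct >= 6:
        return "M"  # Meets expectations: correct about 3/4 of the time
    return "R"      # Revise: attempted everything, but needs more practice
```

Notice there's no weighted average anywhere; the binary-ish structure of the rubric is the whole point.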

The EMRF rubric gets a little more interesting when applying it to more complex work, like mathematical proofs. I have an activity I like to do in proof based courses where students "grade" a piece of work that I present to them. Students read through a proposition and proposed proof and then use clickers to rate the work according to the specifications for student work document.

Students were given a pre-class activity with an induction proof where the base case was left out. In the pre-class activity only 65% of students correctly identified that the proposition was true but that it had this major flaw. Most of the remaining 35% said the proof had only minor errors to be corrected that had to do with style and phrasing. In class, I put this proof up on the screen and asked students what grade they would give it, if they were me. About 1/3 of the students said either E or M, about 1/3 R, and about 1/3 F. We had a really interesting discussion then about what constitutes passing versus non-passing work, and what differentiates E from M. Once students saw the missing base case, all the E/M people switched to F! Students are much harsher graders than I am.

For me, this is an F grade because it's fragmentary. (One could make a good argument for R, though.) A nice teaching moment in the discussion was that **the grade of F does not mean catastrophic failure. It means "fragmentary"** and many times, work graded at an F is five minutes and two sentences away from E, which is the case for this proof.

In my discrete structures course (same class as Case 2) we have this learning target on which students have to demonstrate proficiency:

I can outline and analyze a mathematical proof using weak induction, strong induction, and structural induction.

(There's another learning target where they have to actually *write* a proof.) Here is one of the problems from a recent assessment on this target:

Consider the following proposition: *Every positive integer greater than 1 is divisible by at least one prime number.* Assuming we prove this using strong induction, write clear statements of the base case, induction hypothesis, and inductive step.

Here are some of the less-than-E responses and how I graded them with the rubric:

- A student used $n = 1$ as the base case and showed that since $1$ is divisible by itself and $1$ is prime, then the base case holds. The rest of the proof outline went off without errors. **In this case I gave the student an M.** Does the work meet expectations? **Yes**: The student has provided evidence that they understand the most important aspects of strong induction. Is it complete and well-communicated? **No**: The base case is wrong. This is not a trivial error, otherwise this would be an E. But it's an error that can be corrected through written comments. If the student really wants to earn an E on this target, they can always take the assessment again later and get the base case right.
- A student set up the correct base case. For the inductive hypothesis, the student assumed that for some positive integer $k$, $k$ is divisible by a prime number; then stated that we want to prove that $k+1$ is divisible by a prime number. **In this case I gave the work an R.** Does the work meet expectations? **No:** The whole point of strong versus weak induction is that the inductive hypothesis is different, and work that doesn't demonstrate evidence that they understand this has not met expectations. Is there evidence of partial understanding? **Yes:** The rest of the outline is fine. The student just needs to try it again.
- A student set up the right base case. For the inductive hypothesis, the student said that we will assume that $2, 3, 4, \dots, k$ are divisible by a prime number *for all positive integers $k$*; then stated that we want to prove that $k+1$ is divisible by a prime number. **In this case I gave the work an R.** Does the work meet expectations? **No:** Outlines of induction proofs are expected to show understanding of the basic logic underlying the concept of induction, and getting the quantifier wrong in the inductive hypothesis casts doubt on that understanding. It's a significant logical error in which the proof assumes the conclusion. But, is there evidence of partial understanding? **Yes:** The rest of the proof is fine. The student just needs to try it again and get the quantifier right.
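For reference, a correct outline consistent with the three cases above would look something like this (the wording is mine, not a quote from any student's assessment):

```latex
\textbf{Base case:} $n = 2$: since $2$ is prime and $2 \mid 2$, the
proposition holds.

\textbf{Inductive hypothesis (strong):} Suppose that for \emph{some}
integer $k \geq 2$, every integer $m$ with $2 \leq m \leq k$ is
divisible by at least one prime. (This is where strong induction
differs from weak: we assume the claim for all values from $2$ up to
$k$, not just for $k$ itself.)

\textbf{Inductive step:} Show that $k+1$ is divisible by at least one
prime: either $k+1$ is itself prime, or $k+1 = ab$ with
$2 \leq a \leq k$, in which case the hypothesis gives a prime dividing
$a$ and hence dividing $k+1$.
```

The quantifier matters: "for some $k$, all of $2, \dots, k$" is the hypothesis, while "for all $k$" would assume the conclusion, which is exactly the error in the third case above.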

One of the many things I like about this rubric and the process of continuous revision that it feeds is that the assessment process is now, in the words of Dee Fink, "educative" rather than "auditive". That is, the assessment process is helping students *learn* mathematics rather than simply telling them what they don't know.

It also saves a ton of time. In all these cases, the decision of what grade to attach took me less than 90 seconds, including the time it took to actually read the work. In a traditional grading setting, I would have to not only spot the error but then go on to agonize over whether a messed-up base case should get 10 out of 12 or 11 out of 12 or... And then repeat for the other two cases, and I can guarantee that would take an order of magnitude longer. It left me with enough time to write down meaningful feedback in complete sentences.

Another way this saves time is that I very rarely get work that evaluates to an F. Since students can redo assessments through the semester, if they find themselves taking an assessment and can't produce a coherent finished product, they just bail out, and the work never gets turned in. And that's exactly what they should do, and make plans to try again next time.

Yesterday I handed back some work on assessments, and a student pulled me aside and asked if he could argue for a higher grade. He had done work on a problem (solving a recurrence relation) where he made an initial algebra error that was serious, but then worked through the rest of the procedure correctly. I had given him an R; he was arguing for an M.

Sounds familiar, except this time the student was not grubbing for points -- what's a "point"? -- but rather presenting a coherent and well-considered explanation for why, in his opinion, his work meets the standards for M. It was an exchange between two people on the same level. I did not agree with his argument in the end -- the standards for this problem clearly state that you have to get the answer right in order to be eligible for a passing mark -- but after telling the student this, I could also say, "You definitely show evidence of understanding. You'll get this right next time." It wasn't about *points*, it was about *quality* and there's a world of difference here.

I suspect some of you reading my evaluations above are disagreeing with me, but probably the disagreement is on *quality* issues (standards), not *quantity* issues (points). That's a narrative that I want to support.

A little over one year ago, I made a decisive break with traditional percentage-based grading systems and embraced specifications grading. I was motivated by experiences in my calculus classes where, after over 20 years of using traditional grading, I was finally fed up with the way it gives false positives and false negatives, stresses students out, and disadvantages students who need more flexibility and more choices to show evidence of learning.

That first implementation in January 2015, which I call "Iteration 1", had lots of bugs, but it was still the best experience I'd ever had with grading up to that point. Iteration 2 was for an online course, and although I used specs grading, it was so different due to being online that it's hard to compare it with Iteration 1. Iteration 3 was in the fall, and as I noted here before, it was worse than either Iteration 1 or 2 because I made the system too complicated. What I want to talk about now is Iteration 4, which is what I am currently using, and I think it's the closest I've gotten so far to the ideal grading experience for both myself and for students.

In this post I want to focus on something I stumbled across that makes Iteration 4 work so far, and it's the grading rubric I am using for the major pieces of student work. It's known as the **EMRF rubric** and it's due to Rodney Stutzman and Kimberly Race. It came up in a discussion on the Standards-Based/Specifications Grading community on Google+ (which all the cool kids have joined, so get on that) and once I saw it, I knew that this was the rubric I had been looking for.

I've learned that in this style of grading, **it really pays to have a simple, visual standard grading rubric for all major assignments.** If you have one, then it's helpful for students because it provides transparency and a way for students to self-evaluate; and for you, it provides some measure of consistency in grading without having to agonize about giving the same number of points for similar work. Classic Linda Nilson-style specs grading uses a two-level rubric -- Pass/Fail. I instituted a three-level rubric for grading proofs and other complex problem solving tasks -- Pass/Progressing/Fail. (I changed "Fail" to "No Pass".) The middle layer of the rubric is for work that doesn't quite meet the specifications I set out, but it's close, and pragmatically the difference is that students have to spend a token to revise and resubmit "No Pass" work but they can revise "Progressing" work for free.

What I found was missing from this three-level rubric was a designation for really excellent work. The "Pass" level was synonymous with "good enough", and in my courses this is exactly what I was getting -- "good enough". There wasn't much incentive for excellence. Yes, I could set the bar very high for "good enough" and call that "Passing", but it always felt to me that there needed to be something like "Pass+" in my system, and then the requirements to earn an A in the course would require a certain number of instances of really excellent work.

The EMRF rubric does this. It is basically a Pass/Fail rubric with one extra layer on each side. In my specs grading system in Iteration 4, anything that registers as E or M is considered "Pass", and I have laid out in mind-numbing detail in a specifications document what it takes to attain these levels. Likewise anything that grades out to R or F is considered "No Pass" or (as I prefer) "Not Yet" or "Try Again".

In my Discrete Structures courses, students work on three kinds of assignments -- **Assessments** (short timed in-class quizzes that measure proficiency on a single one of 20 different learning targets for the semester), **Miniprojects** (which apply basic knowledge to new problems), and **Course Management** tasks that include preparation activities, daily homework, and weekly quizzes over the syllabus. Assessments and Miniprojects are graded on the EMRF rubric while course management tasks are graded Pass/Fail, usually on the basis of just completeness and effort. To get an "A" in the class, students must:

- Earn "Pass" grades (E or M) on all 20 learning target assessments, then provide a second item of evidence of proficiency for 10 of those 20 learning targets, and earn at least five grades of E in the process.
- Earn "Pass" grades (E or M) on 8 Miniprojects (out of 10--15 in all) including at least two grades of E.
- Pass at least 90% of all the course management tasks.

The "second item of evidence of proficiency" can be taking a timed assessment a second time and passing it, or doing an oral assessment in the office during office hours, or making a case that the work on a Miniproject shows evidence of proficiency.

For a "B", students need to Pass all 20 assessments, provide secondary evidence for 5 learning targets, and earn at least two E grades in the process. They must also Pass 6 Miniprojects and earn at least one E grade; and pass 80% of the course management tasks. For a "C", students have to Pass all 20 assessments and provide secondary evidence on at least 3 of them, with no requirement for E grades. "C" students also must Pass 4 Miniprojects, again with no quota for E grades, and pass 60% of the course management tasks. There are also contingencies for D and F grades and a set of rules for determining plus/minus grades.
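Since the grade requirements are really just a rule table, they can be sketched as a short function. This is my own illustrative encoding, not anything from the actual specifications document (the function name and signature are hypothetical), and it omits the D/F contingencies and the plus/minus rules:

```python
def course_grade(passed_targets, secondary_evidence, assessment_e_count,
                 miniprojects_passed, miniproject_e_count, mgmt_pass_rate):
    """Sketch of the base grade rules described above.

    passed_targets: learning targets passed (E or M), out of 20
    secondary_evidence: targets with a second item of evidence of proficiency
    assessment_e_count: E grades earned on assessments
    miniprojects_passed: Miniprojects passed (E or M)
    miniproject_e_count: E grades earned on Miniprojects
    mgmt_pass_rate: fraction of course management tasks passed
    """
    # Every passing grade requires Passing all 20 learning targets.
    if passed_targets == 20:
        if (secondary_evidence >= 10 and assessment_e_count >= 5
                and miniprojects_passed >= 8 and miniproject_e_count >= 2
                and mgmt_pass_rate >= 0.90):
            return "A"
        if (secondary_evidence >= 5 and assessment_e_count >= 2
                and miniprojects_passed >= 6 and miniproject_e_count >= 1
                and mgmt_pass_rate >= 0.80):
            return "B"
        if (secondary_evidence >= 3 and miniprojects_passed >= 4
                and mgmt_pass_rate >= 0.60):
            return "C"
    # D/F contingencies and plus/minus rules are handled separately.
    return "below C"
```

One nice property of writing it this way is that there is no arithmetic anywhere: a student's grade is determined entirely by which thresholds their body of evidence clears, not by averaging points.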

This, I think, captures what I want from grading -- students choose the grade they want to earn and that grade sets the agenda for their work in the class. Baseline proficiency is considered to be showing one piece of evidence that you are proficient with all the major skills objectives in the course, turning in about a third of the Miniprojects assigned, and giving a reasonable amount of attention to course management. That's a "C". To get higher than a C, you have to demonstrate work that shows more depth (secondary evidence of proficiency on learning targets), more breadth (more Miniprojects), and some evidence of true excellence (the "E" quotas).

The difference between EMRF and straight Pass/Fail is the kind of feedback the letter communicates. A grade of M means *This meets expectations*, but it also honestly communicates that it could still be better. For many students who get an M on an assignment, the existence of an E will impel them to retry the assignment to raise their grade even though it was "good enough" the first time. Likewise, a grade of R or F *means* the same thing in the grading system -- you still have to revise and resubmit if you want the work to be counted -- but it *communicates* two different things, with "R" saying *There is partial understanding here, but something important is missing* and "F" saying *There was too much missing to really know if you understand*.

I do not like the letter "F" in this rubric, though, because of the emotional baggage attached to it. People assume it means "fail", and it sort of *does* mean that, but then students too often see failure with such negativity and finality that they miss the message that they can try again. I would probably rebrand that level, maybe "I" for "Incomplete" or "S" for "Significantly flawed".

In the next post I'll show some examples of how I've graded with this rubric and how we've included it in some class discussions about the quality of work and professional standards.

*Happy New Year everybody. This post is the first one I've made since November 2015, but I am making an effort to get back on the wagon and write here more often (like a lot of guilty bloggers are possibly doing). So here we go.*

Tomorrow (January 11) our new semester kicks off. Confession: I am not good with first-day activities. I don't enjoy icebreakers -- didn't like them as a student, don't like them as a professor. At the same time I don't like launching right into the course material on the first day because enrollments tend to remain in flux for a week or so, and I don't like putting new people behind at the outset. My solution for the last year or so is to use a variation on Dana Ernst's "Setting the Stage" presentation, which gets students thinking big-picture on the first day really effectively, and to gather some personal information about students that helps me get to know them better.

This time around I am doing the latter via a Google Form survey that I want students to do before day 2. I've done this in the past, but this time it turned out differently, because I've been using and thinking about specifications grading for a full year now. I want students to think about their grades on the first day -- to begin with the end in mind, as they say.

Actually I would rather students not think about grades at all. But until we get rid of grades entirely, their mind-altering influence will persist among students and faculty alike, and so insofar as students think about grades I would like for them to think of grades as *goals*. I don't want them to think of grades in terms of "hope" -- as in, *I really hope I get at least a B in the class* -- but rather as the outcomes. Not the outcome of random processes, in which students are like ancient pagans sacrificing time and energy to the Grade Gods (i.e. professors) in hopes of a good harvest. Instead, these should be the outcomes of reasonable goal-setting, careful planning and personal management, and of concrete evidence of learning. There should be no need for "hope" to be involved.

Among other things on this survey, students respond to the following three items. First, this:

At the beginning of each semester I like to ask each student to set a goal for the grade they would like to earn. Please don't say "A" just because that's what you think you're supposed to say. Many students drive themselves crazy because they think they are supposed to earn A's in everything when actually a "B" is perfectly suited for their goals and far more reasonable. Think carefully about how far you want to go in the course: Think about your personal interests, your academic goals, your intervening work and life responsibilities, and your skill set, and set a goal for a grade that is realistic and attainable, whether that's an "A" or a "C". Take 5 minutes to think it through. Once you are done: Check the grade that represents what you think is the most realistic, reasonable, and attainable grade for you given all the factors you considered. Whatever you choose (as long as it's passing!) I will support.

There's a pulldown menu below this item with the grades A through C- on it. Next, they answer:

Now explain your reasoning behind the grade you chose.

There's a paragraph to enter text below it. Finally there's this:

Now go to the syllabus and take a 3x5" notecard, and write down all the coursework you need to complete in order to earn the grade you chose. I may ask to see this in class. This card is important -- you can use it at any point in the course to check against your grade records to see how much further you need to go. Go and do that now.

Students are supposed to click "OK" once they read this. And yes, I do intend on spot-checking people's cards -- whenever a student later in the semester wants to come and talk about his or her grade in the course, I will tell them to make sure to bring their card along.

I've had students do this in the past but only informally. This time I really want every student to *start* with the grade and then work *towards* it, rather than work like crazy to ace everything and then "hope for the best". My experience has been that most students still select "A" and most of the time this is just a reflexive action in my opinion. But there are some students who have never been given permission to aim for less than an A in a course, even though their life situation and skill set make earning an A an uphill battle that they are likely to lose. And it's very freeing for those students to have the prof say: *If a B is the best you can do in the course for whatever reason, that's OK, and I will have your back the whole way as you earn it.*

It seems to me that we have a problem in higher education with not setting our own goals. We are constantly trying to attain goals that we didn't set. Students deal with this because professors or parents or programs often insist on only the highest levels of attainment even when this doesn't necessarily make sense. And we faculty often have to deal with it as well through tenure and promotion goals that, in some places, are wildly optimistic and totally opposed to the natural skill sets and interests that faculty have. Very rarely have I seen universities where faculty are allowed to set their own goals for teaching, scholarship, and service within a reasonable and broad framework. It's so much like our ingrained system of grading that you realize that the apple doesn't fall too far from the tree.

What's hopeful for me is that standards-based and specifications grading sets up a natural structure for students to participate in their grades in a healthy and proactive way, where they are in control and they get to decide what they want from the course. To some extent traditional grading *might* be amenable to this, but that seems to be the exception.

Also it's always interesting to see what students say for their rationale -- and a good point of reference for how to work with those students in the course.
