Office Hours: Faculty on Grading Practices and Grade Inflation

Editor’s note: For this regular column, we email the entire Swarthmore faculty asking the listed question by Saturday morning. We publish every contribution that we have received as of publication on Wednesday evening online, but sometimes must cut responses in print due to spatial constraints. Each contribution is edited for clarity and syntax only.

As part of our regular opinions series, “Office Hours,” we aim to feature a range of faculty voices on higher education and specific questions relating to Swarthmore College. We believe that students, staff, and other faculty could greatly benefit from reading professors’ diverse perspectives, which many in the community may not have ever considered. In our second edition of this column, we asked professors to share their thoughts on the following question:

In recent years, there has been increasing concern among faculty and students about grade inflation at Swarthmore and across academia. Do you think such grade inflation is happening? If so, are you concerned, and what do you think are the underlying reasons behind this trend? If you see grade inflation as a problem, what steps should professors take to address it?

Here are their responses:

Alba Newmann Holmes, Assistant Professor of English and Director of the Writing Associates Program:

Rather than talking about grade inflation, I wish we could talk about why we have grades at all. I know from teaching CR/NC semesters that some students feel they will only try their hardest if a grade is on the line; so for them, the grades offer a form of motivation or accountability. But for many students, I feel focusing on grades actually detracts from their ability to get outside of their comfort zone, take intellectual risks, or feel a sense of belonging at the school to which they have been admitted. I don’t feel as though ranking and ordering people is integral to their learning; that’s a management system. In my classes, if you are unsatisfied with a grade you receive on a paper, you can revise it again. Is your grade “inflated” because you put more effort into it? I don’t think so.

If we are going to grade, then I want that system to reflect what I value in terms of student learning, and I value their engagement with process. (Plus, in all my years of offering this option to revise again, I can only recall one paper that was actually worse after the additional revision. When I apologetically returned it without an improved grade, the student told me, “That’s okay. I forgot I had the option to revise, and made a few changes in the fifteen minutes before class.” And then I felt affirmed, because I could tell.) It may be a lot to ask, I know, but I want students to partner with me intellectually and to have a sense of shared endeavor, and I don’t feel as though grading is the best way to foster that.

Ben Mitchell, Visiting Assistant Professor of Computer Science:

Grades are, fundamentally, a mess.

As a system, they are an attempt to measure several things at the same time, even though those things are not always compatible. These measurements are then supposed to be used in several incompatible ways at the same time. Oh, and as a bonus, the things we really want to measure are fundamentally unmeasurable, so we’re left with a set of approximations and stand-ins that have a whole bunch of known flaws.

So, grades are a mess. We still use them, because the signals they provide still have some value, and because we haven’t come up with anything better. “Grade inflation” is one of the many problems that are a result of this mess, though I would argue it is not actually the most important one. Still, to understand why grade inflation happens, and to what degree we should worry about it, we need to start by understanding “grades.”

Fundamentally, there are three groups that grades are intended to provide a signal to: the person being graded (e.g. a student), the person doing the grading (e.g. a teacher), and the third party (e.g. an academic advisor).

For the person being graded, the purpose of a grade is to help them understand what they do and do not know, as well as to calibrate their self-knowledge. In other words, how accurately can I assess what I know? If I think I understand a given concept, how likely is it that I actually understand that concept? See the Dunning-Kruger effect for why this is important. Clear and helpful feedback has an important role to play in allowing students to learn efficiently and gain confidence in their own skill and knowledge.

For the person doing the grading, grades help in understanding which students are struggling and might need more help. Grades also have the role of helping an instructor assess their own teaching. For instance, if everyone gets a particular question wrong on an exam, it’s typically a sign that the teacher has made a mistake (either through a failure to teach the concept well or through a failure to write the exam question well). This typically requires looking for patterns of mistakes, rather than focusing too much on specific mistakes of individual students, which may have other causes. Still, assessing teaching is an important purpose that grades are intended to facilitate.

For the interested third party, grades are intended to signal the degree of knowledge/skill that a student has in some particular subject. For instance, does a given student have sufficient mastery of the material from Calculus 1 to be prepared to take Calculus 2? Does a candidate for a job have the skills needed to actually do that job?

These groups are not all looking for the same type of information, and it is frequently the case that modifications to the assessment system intended to help with one purpose end up as a hindrance to one or both of the others.

In addition to these different stakeholders, there are different things we might want to measure; the three most common in academia are: meeting a standard, demonstrating mastery, and showing growth/improvement.

Meeting a standard is, in some ways, the easiest to measure. At least, if we assume that the standard itself is correct, sufficient, fully specified, and amenable to direct assessment (spoiler: in practice, this is never really true). In this type of assessment, the standard is typically viewed as a minimal bar that all students are expected to be able to achieve (again, in practice this is unlikely unless the bar is so low as to make the assessment rather pointless). From this viewpoint, a reasonably competent student with an appropriate background who is willing to put in the necessary effort should always be able to demonstrate their ability to meet the standard. This is one of the major factors in driving “grade inflation,” since if grades are based on this type of standard, it may actually be reasonable for most students to get “full marks.”

This view is contrasted with assessment of mastery, which is typically defined as the point at which someone has learned more or less all there is to learn. This does not mean that no improvement is possible, since even a true master continues practicing and slowly refining their skills; rather, it means that there is no longer anything left to teach, and that remaining refinement can be done independently. In some cases, mastery can also be taken to mean that the material is understood well enough that the learner can now turn around and become a teacher for this content. In this view, “grade inflation” is a serious problem, since a high grade suggests that a student has nothing left to learn, and achieving true mastery is typically expected to be something few, if any, students manage. In other words, getting “full marks” in this setting would require going far above and beyond the basic “expectations.”

The third thing we might want to measure is growth, or how well a student is improving. Fundamentally, growth is what a teacher endeavors to accomplish; if a student already knows everything before taking a course, then the teacher can take no credit for that student’s success. Additionally, nobody is ever truly “done” with learning; academics can take us only so far, and there will always be important things that can only be learned “on the job.” On top of this, since not every student enters a course with the same background, measuring growth gives a more “fair” assessment of what that student has actually learned. It is important to note that this is not merely measuring “effort,” though that may be a part of it; rather, it is an attempt to measure what is achieved through that effort. Here, again, what “full marks” means is rather different than in either of the two previous cases.

And even with all this confusion, matters still get worse when we try to actually measure these things, because they are all fundamentally things which cannot be directly measured. Knowledge? Skill? Understanding? These are words intended to describe the internal state of someone’s mind. This is not something that is directly observable. As a result, we measure other things that we use as proxies: the ability to solve problems on an exam, write about ideas in a paper, etc.

The difficulty is that all of these real-world assessments are not just measuring the things we care about; they are also measuring a whole host of other things. In practice, a score on a calculus exam measures not just a student’s calculus knowledge, but also how fast they can do calculus, how well they perform under pressure, their general test-taking and study skills, their ability to read and follow directions, their ability to understand the instructor’s intent with the questions, their level of anxiety, things like dyslexia/dyscalculia, stereotype threat, how well they slept the night before, what other stressors or distractions are on their mind, and what they ate for lunch. And this is by no means a comprehensive list. We may care about several things on this list (e.g. general study skills may actually be valuable, as can the ability to read and follow directions), but even then the inability to disentangle these things from each other is an obstacle, not to mention all the other things that we really don’t want to be measuring.

Different types of assessment may measure different sets of things-we-don’t-actually-want-to-measure; e.g. written problem sets remove some of these issues, but introduce a new set of issues in their place. Many instructors try to combine the results of several different types of assessment in an attempt to reduce the impact of these outside factors on the final composite score, but there are real limits to how far this approach can take us.

On the whole, this matrix of conflicting goals, interests, targets, and flawed measurements means that interpreting grades needs to be done with great care and a healthy scepticism, lest we read too much importance into them. While grade inflation, and the resistance to it, is one result of this mess, it is not the most important one.

No, the real problem arises when the various stakeholders (be they students, parents, teachers, or anyone else) start viewing grades as being actually important. This results in all sorts of unfortunate behaviors designed to maximize grades, instead of maximizing the things we should actually care about like understanding, skill, knowledge, critical thinking, etc.

For me, the basic definition of “cheating” is, “something that improves the measured value without improving the thing it is intended to measure.” However, this definition only makes sense if you understand that the thing we’re trying to measure is fundamentally different from the actual measured value. If you really think that a high score on an exam or assignment is your true goal, then using whatever means are available to maximize that value is perfectly rational behavior. Why would we say that some means of getting a high score are good, while others are bad, if the high score itself was the ultimate end goal?

Another major pitfall is that it becomes very easy to put more emphasis on the importance of things that are easy to measure. Again, this has many unfortunate consequences, but notable among them is the de-emphasis of the arts and humanities in American schools as a result of the No Child Left Behind Act that tied outcomes for schools to test scores; after all, how does one come up with an objective metric to assess art or music? If the score is what matters and we don’t know how to measure a score, then suddenly it appears that the thing itself does not matter. This logic is obviously fallacious, because it starts from a false premise, but far too many people seem to have been convinced by it.

So, in conclusion, grades are a mess, and you really shouldn’t put too much emphasis on them. They’re not completely without value, so get what use you can out of them, but ultimately you should be more focused on the things that actually matter, even if grades are an imperfect measure of those things. Should you be worried about grade inflation? Sure, but maybe worry about the more serious problems around grading first.

K. David Harrison, Professor of Linguistics and Cognitive Science:

Your framing of the question reveals an assumption that grade inflation at elite universities may signal a decline in academic rigor. Yet, as progressive educators like Alfie Kohn have persuasively argued, this misses the deeper point that grades are never neutral instruments. A grade’s meaning shifts with institutional context, cultural expectations, and historical period. An “A” is little more than a social signal, negotiated between faculty, students, and the prestige of the institution. From my perspective as a Swarthmore faculty member, if grade inflation erodes faith in grades, that may be an unintended gift. It may weaken our reliance on grades as gatekeeping devices. Many of us believe that authentic learning resists quantification. Just as ethnographers prefer thick description over reductionist coding, educators should prefer richer narratives of student progress over grades. Can we create the intellectual space to experiment with alternatives that honor curiosity and creativity? An education measured by authentic engagement, rather than grade point averages, would better reflect the diverse ways humans learn and thrive.

Linda Huber, Aydelotte Foundation Technology and Social Justice Postdoctoral Fellow:

In general I feel that the idea of “grade inflation” reflects a scarcity mindset: the idea that only the top-performing students should be receiving As. This, in turn, seems to reflect the social function of grading as a mechanism to aid future employers in ranking and sorting out the “best” students: high grades often act as a proxy for measuring those who are most capable of meeting deadlines or those who benefit from an accumulation of educational privilege. This sorting and ranking function seems fundamentally disconnected from the actual work of assessing students and providing feedback on their progress.

I am a big fan of historian David Noble’s thoughts on this topic: “When skeptical colleagues protest that it is not fair for me to give the same grade both to people who work hard and to people who fail even to show up, I remind them that these people are not getting the same reward because the people who work hard also get an education. ‘Oh, yeah,’ they say, remembering as an afterthought what should be at the forefront of their profession.” You can find more of his thoughts on this topic here: https://www.policyalternatives.ca/news-research/may-2007-giving-up-the-grade/.

Mark Wallace, James Hormel Chair of Social Justice and Professor of Religion:

I think grade inflation is a question particular to the problem of grades in general. My preference would be not to give grades. In my judgment, grades undermine the joy of learning as an end in itself. Grades depress initiative, encourage rote memorization, and set up a false equivalence between academic success and high marks. Particularly in the humanities, student work should be judged by faculty giving personalized responses to papers, exams, etc. instead of letter grades. Be that as it may, we live in a society where graduate schools and prospective employers require graded transcripts and GPAs for graduates. Sadly, not awarding grades is not an option. But, in an ideal world, grades would drop off and written evaluations would become the standard for assessing student development.

Peter Baumann, Charles and Harriett Cox McDowell Professor of Philosophy and Religion:

One sometimes hears that decades ago the most common grade at Swarthmore College was a C+. Let’s assume this is true. Things are different today. It seems that the most common grade (if there is something like that) or the average grade is much higher these days. Does this mean that grade “inflation” has occurred at Swarthmore College? Can students nowadays “purchase” the same grades as decades before while spending less scholarly “money” and better grades than decades before while spending the same amount of scholarly “money”? (Shouldn’t this rather be called “grade deflation” then?) But perhaps student work has gotten better over the decades? It is not easy at all to determine this.

But suppose there has been grade inflation (or deflation). Would this be a bad thing? Why? Why believe that faculty were better at grading 60 years ago than we are today? Perhaps past faculty got it wrong and we’re getting it right (or closer to right), rather than the other way around? Perhaps there has not even been any inflation (deflation) in the first place but the meaning of “C+”, “A”, “B-” etc. has changed? Perhaps a 1960s “C+” means, roughly, what a 2020s “A-” means? If that should be the case, then not much would have changed after all and there would be little reason for concern. I don’t know much about grade in- or deflation but I worry more about other things. I want (like everyone else, I guess) to be fair in my grading and to weigh the different aspects relevant to grading against each other in an acceptable way. This can make grading tough. Sometimes I wonder whether we should have a different way of grading. But what could that look like? Or would we be better off without grades, strengthening intrinsic motivation?

Sa’ed Atshan, Associate Professor of Peace and Conflict Studies:

Over the course of my career, having now taught over 1,500 students, I have only assigned an A+ grade to two students thus far. Yet I have noticed the rapid rise of A+ grades across the College over the past few years. This is concerning. I hope that we can reserve such a designation for only the most extraordinary cases of student academic achievement.

Sibelan Forrester, Susan W. Lippincott Professor of Modern and Classical Languages:

I have two thoughts. First, there are some subjects where you really can compare students’ performance objectively. In a language class, for example, a student either knows the words or they don’t; either they’ve learned how the grammar works or they haven’t, probably like math or other STEM subjects. There’s a right answer or, in the case of a language class, a bunch of right answers, but also a bunch of wrong answers. In classes of that kind, I imagine, you’re less likely to get anything you’d call “grade inflation.”

Second, if all the students who take a class are doing really good work, do we really want to grade them on a forced curve? Perhaps they have “self-selected,” and only confident, thoughtful, hard-working people are taking that class. I’ve given students grades of C, or even F, but they had to earn it!

Syon Bhanot, Associate Professor of Economics:

I can say with certainty that grade inflation has been happening over time at Swarthmore, as I have personally seen aggregated data over time. However, it is worth noting that average grades “peaked” a few years ago during COVID — average grades have gone down slightly since then, but remain near that historical high mark. My personal opinion is that the average grades at present are extremely high relative to what I feel they should be.

A churlish response for why grade inflation has happened is “because professors are choosing to give students higher grades.” But that is the reason. This has been a trend in both higher education and in secondary schools. Of course, it is possible that students are simply getting better over time, on average, but other data on educational performance does not support that. Personally, I have sensed a slight dip in student performance at Swarthmore in recent years (and know many faculty feel the same, not just me). One of several pernicious effects of grade inflation that I worry about, in that context, is that we are increasingly misleading students about what their strengths are (and are not).

Overall, I think many interrelated factors are responsible for grade inflation at Swarthmore, including (but not limited to):

1) COVID: I think COVID was a moment when essentially everyone was struggling with mental health. I think many faculty at Swarthmore (and around the country) felt sympathy for the students in that moment and acted on that by becoming more lenient graders. And old habits die hard.

2) Avoidance and Incentives: Students do not like getting low grades, and they are increasingly less used to getting them prior to college. So for a faculty member, giving out high grades is a way to avoid conflict — with students, administrators, parents, etc. It also has the upside of attracting more students to your classes, and who wouldn’t want that?

3) A Decline in Standards: Sadly, I do think we have “lowered the bar” for an A here at Swarthmore — I have been in conversation with many faculty members who have spoken openly about trying to make their classes easier (and some students as well, who have noticed the trend in specific courses). I, for one, would love to see some of Swarthmore’s infamous “rigor” brought back (but perhaps not too much!).

How to solve this problem is complex, in part because it is a collective action problem. That is, no one professor or one institution has any incentive to combat the trend — it must be done in a collective way. One quick potential “fix” is to include each course’s average grade/GPA next to the student’s grade in that class on their transcript, so employers/graduate programs can contextualize the students’ performance.