On this episode of the Academic Medicine Podcast, authors Eric Warm, MD, and David Hirsh, MD, and medical student Kate Jennings join host Toni Gallo to discuss the unintended consequences of the shift to pass/fail grading in undergraduate medical education and current challenges in the residency application and selection process. They explore the feedback culture and incentives for pursuing clinical excellence in medical school. They also present the complex adaptive system model as a framework to consider the dynamics at play and ways to improve the transition to residency.
This episode is now available through Apple Podcasts, Spotify, and anywhere else podcasts are available.
A transcript is below.
Read the article discussed in this episode:
- Warm E, Hirsh DA, Kinnear B, Besche HC. The shadow economy of effort: Unintended consequences of pass/fail grading on medical students’ clinical education and patient care skills. Acad Med. 2025;100:419–424.

Transcript
Toni Gallo:
Welcome to the Academic Medicine Podcast. I’m Toni Gallo. On today’s episode, I’m joined by Doctors Eric Warm and David Hirsh, authors of “The Shadow Economy of Effort: Unintended Consequences of Pass/Fail Grading on Medical Students’ Clinical Education and Patient Care Skills.” Also, joining us is medical student Kate Jennings.
In our conversation we’ll talk about what the residency application and selection process looks like today, how we got here, and the impact that’s having on medical students and the learning environment. Then we’ll get into some possible pathways forward to address the current shadow economy of effort that Eric and Dave talk about in their article. With that, let’s do introductions. Eric, Dave, you want to go first and then Kate?
Eric Warm:
Sure. My name is Eric Warm. I am the Vice Chair of Medical Education at the University of Cincinnati. I’m also the Internal Medicine program director, a role I’ve had for 15 years. And I have been recruiting to residencies for the past 28 years, so I have some experience in looking at medical school applications and applicants.
David Hirsh:
My name is David Hirsh. I’m a professor of medicine and the Associate Dean of Undergraduate Medical Education in Cambridge, Massachusetts for Harvard Medical School, and I work clinically at Cambridge Health Alliance in the Department of Medicine and the Department of OBGYN. My interest is in medical education transformation and I enjoy that through a research lens and as a teacher.
Kate Jennings:
Hey everyone, I’m Kate Jennings. I’m a third-year medical student at the University of Cincinnati College of Medicine. I’m very passionate about medical education. I recently had my paper on the association between electronic health record metrics and clinical performance accepted to Academic Medicine. Also, I’m in the competency-based medical education pathway at my school, which is a pilot pass/fail program. I’m so excited to be on this podcast and share my experiences.
Toni Gallo:
Congrats on the paper, Kate. That’s awesome. And welcome all of you to the podcast. I’m looking forward to our conversation. I thought we could start today with where are we? What does the residency application and selection process look like today? What’s the baseline where we should be starting this conversation?
Eric Warm:
It’s like the proverbial elephant in the room where you’re only seeing a part of it, but from the perspective of the residency program directors, we’re getting record numbers of applications from people for whom we have a very difficult time differentiating between the applications. Most of the data that we receive is not predictive of performance later in residency, and so we’re left to our own devices to try to figure out about somebody. We often treat 15 minutes of an interview as more important than four years of a record of somebody in medical school because we don’t trust that record, and there are many perverse incentives that lead that record to being not as accurate or honest as it could be. So from our perspective, we’re getting a lot of applications from many people that we can’t differentiate, and it’s hard to understand what is valuable coming from the medical school record. I’m sure it looks different from what Dave is seeing on the undergraduate side.
David Hirsh:
I think the undergraduate side, that’s obviously vexed by two things that shouldn’t be competing, but I think appear to be competing. So on the one hand, we want to propel our students as far forward as possible clinically. To be as able at science and clinical science and caring for their patients, truly, truly masterful in the care of patients. So that’s I think an obvious… and it’s the ancient sacred agenda, but it is of course not separate from, it cannot be separated from their reasonable personal goals of getting their residencies, getting onto the next step of their lives. So when those things are confluent, good things happen. The degree to which those things could compete with one another, those two goals, I think that sets us up for the difficulty which we might be finding around this discourse and pass/fail.
Kate Jennings:
As a medical student, I can add that this is a time defined by so much stress for medical students. I think all this competition is highly motivating and pushes us towards excellence but with so much self-inflicted pain. As Eric and David wrote about in their paper, in this complex adaptive system, the solutions are really murky.
Eric Warm:
In medical education, we’re having this thing called competency-based medical education, which sets a criterion which you must reach. And pass/fail is one of those criteria. It’s an example of a criterion above which is good and below which is bad. So you’re not really having normative comparisons between people when you have pass/fail. It’s either… it’s dichotomous, one or the other, and everyone above the line looks the same. Which is laudable and makes sense, but the world outside of that test is completely normatively compared. We just introduced ourselves. Dave is from Harvard, I’m from Cincinnati. The listeners have just decided about us either explicitly or implicitly because there’s judgments made before you get to medical school, when you’re in medical school, when you’re in residency about what good and less good is, and that all is happening at the same time that we’re trying to create criterion referencing in grading. And this juxtaposition is really difficult. And the question that’s out there is can you have competency-based medical education in a world with grades the way we envision them now? And I don’t have the answer to that except it’s causing conflict.
David Hirsh:
I think there’s an area where Eric and I have had hearty debates. We share a view that regardless of the system one uses for discrimination, let’s say whether it’s tiered grades or pass/fail or some combination, what part of the curriculum those reside in, if not the entire curriculum. Regardless, it seemed that we must have the highest bar, a very, very high bar, what we expect the students to be able to know, do and be. So you can imagine, a tiered grading system which doesn’t advance people past some low minimum standard, would not be acceptable and neither would a pass/fail system which has… it sets a very low bar for excellence. So I think one of the things in addition to the fact that we have this strong urge to make sure we focus on the patient and making… building for the most excellent in caregiving is also the idea that the bar has to be high notwithstanding any system or ideology around pass/fail versus tiered grading.
Eric Warm:
Here’s where Dave, I absolutely agree that we should minimize the gap between the best and the worst. Now what criterion are we using? That’s the rub. When we think about… the major point of our paper is that we’re not really talking so much about pass/fail, but what is the purpose of medical school? It’s to create great clinicians and yet we have perverse incentives to pull us away from that. So part of our problem is we don’t even agree on what the bar is. I’m going to guess, I’m going to ask Kate here, Kate, did you feel like great physicianhood is the thing that gets you into residency or is it something else?
David Hirsh:
Right, there’s a set of incentives that a student would experience, Kate. So when you’re thinking about what drives you or what’s the aspiration, that may be one question, but what’s practical might be a different question. What’s your sense?
Kate Jennings:
I’m just going to talk out loud a little bit. I’m trying to really reflect on what motivates me most. And when I get down to it, it’s really my own internal motivation to try to do better all the time for myself and for my future patients. I’m really not trying to say that to be cheesy. I’ve always felt very grounded in that.
Eric Warm:
But do you think that’s going to get you into a residency program? Because that doesn’t go on a piece of paper that I’ve seen other than your personal statement and everybody says that. So how will you differentiate yourself from every other person who says those exact same words?
Kate Jennings:
Of course, I’m going to leverage all this medical education research and experience I have. No doubt that that is the strongest thing that I have. And I’m going to try to do as well as I possibly can on Step 2, even being on this pass/fail pathway, I feel that I need to excel on Step 2. So you’re absolutely right. I still have to do all the things.
David Hirsh:
Kate’s thoughtful response actually helps us think about another element as we get towards this notion of a shadow economy. So one idea of course, again there’s very limited evidence around a lot of the aspirations for pass/fail generally. This is not to say that it’s not good… or I’m not making a comment about its merits. I’m just commenting that the lack of evidence doesn’t tell us either way. The evidence that does exist appears to be principally in the pre-clerkship time. And even if you were to rate the evidence as Alex Iyer who’s a student at Harvard Medical School, he and a team of people have done this very thoughtful review of all the empirical evidence in this domain, and none of the papers actually rank highly in a standard research way of assessing quality of data. It’s not to obviously impugn the authors, it’s just that the data are new. I mean, it’s an emerging field.
I raised this though because … all these aspirations which may end up being… having empirical basis later are driving this debate. But what is problematic of course is that there are secondary unintended consequences. So our paper was hoping to raise not a critique of one grading system versus another, but to raise this large issue of when you make sweeping change, we must be very, very careful to consider unintended consequences. So I wonder if that’s a topic we could go into too in this talk.
Eric Warm:
And that’s what started it all David, is that when you see these large complex interventions put forth and then in the writing, there’s no prediction of what might happen afterwards. It’s interesting. We do this a lot in medical education. We get so caught up in our ideas and we throw them out there as if it was a panacea for some set of problems. And then we are surprised when this other thing happens. I think we should not be surprised when people who are rational actors act in ways that serve their self-interest. We could have asked that question beforehand or model that all the stuff we’ll talk about in a minute about how to, not predict, because it’s hard to predict in a complex adaptive system, it’s one of the components of it, you have emergent order, you don’t know what’s going to happen, but not be surprised when something you didn’t expect happens. That I think has happened in our medical education world.
Toni Gallo:
So let’s get into some of the unintended consequences of the change to pass/fail grading, the change to pass/fail Step 1 scores. You’ve talked a little bit about some of that, but what are some of the things that we’re seeing? You described this shadow economy of effort in your paper. What does that look like?
Eric Warm:
In the paper, we run through some of the statistics for what people are doing in order to differentiate themselves. If grading can no longer do that, if whatever place you went to can no longer give that to the residency program, there might be some other way to differentiate and it’s become an arms race.
We quote a statistic where the 90th percentile of neurosurgery applicants will have 58 abstracts or presentations or papers, 58 in four years. That’s unbelievable. And I think the median was somewhere around 30. And you can go down to every specialty, you’re going to find that “more is better.” So what we’re worried about is that, well, if you’re spending all this time working on that stuff, how are you taking care of patients and how are we assessing that? And there’s absolutely nothing in there that’s valid. And by in there, I mean the medical school application that I can see that shows us how good someone is as a clinician and how well they can do the care that they’re about to do when they become an intern. And there’s tons of papers that show that what is in those files is not predictive.
So we’re worried about this thing called the triple harm. We’re worried that people will devalue clinical excellence in favor of these other things. Get burned out anyway because it’s hard to do 58 abstracts, presentations, and papers in four years, and then some of that work is probably not as authentic as it could be or deeply thought out and people pad their CV. And so this triple harm that I just described I think is what we’re worried about as a consequence of behaviors that we’re seeing.
David Hirsh:
We’ve tried to discuss this whether in talks or in the paper, the degree to which social psychology or other domains of knowledge may help us think it through. It’s not that any particular theory is entirely explanatory, but we can call upon them to help us frame the argument. So for example, Eric mentioned earlier this idea of rational actors or rational actor theory. And if you think about this, human beings not only do, but probably should, act in their own best interest. It’s completely reasonable for a person in medical school to want to go have the residency of their choosing. Residency selection associates with all sorts of very important parts of your future. And Eric can recount these if you wish. But it’s very, very important. So it’s entirely rational for one to seek that.
But what happens is an individual seeking that may generate the behavior for individuals around them to also seek that and then the individuals around them. And at some point you have a very, very competitive environment where all sorts of rational people acting reasonably make an overall society, which is acting in what Eric just referred to as an arms race.
So the very large numbers of volunteer experiences, work experiences, abstracts and presentations and research experiences which are now needed to prove that I am worthy of your residency, that I am different than my peer. That’s generating an environment, which seems to me to be at risk of not improving the well-being and perhaps worsening the burnout. Or in the very least, pushing it downstream to the next phase of the curriculum where the grades actually still exist.
Eric Warm:
So what Dave is describing there is something called Campbell’s law. I think we wrote about this in the paper. We use this quote, Campbell’s law says, “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” So in this case where we tried to make pass/fail decrease the burnout and the stress on a resident, we actually, I think, made it worse and there’s quite a good bit of evidence for it. And what is interesting about what the ERAS did, the organizing device that puts medical student data in front of residency directors is they decided to put 10 meaningful experiences. They set an anchor of 10. Now the medical student has to think, well, before I could do as many as I wanted or few as I wanted, now I have to do 10. And the number of times that I saw less than 10 this year was near zero because now we’ve got to fill that space. And so there is a cost of all this, and I think the cost is that it didn’t help medical student burnout. It might’ve made it worse, certainly didn’t make it better or the promises of better didn’t come true.
Kate Jennings:
Something I thought that was interesting in the paper and also in our conversation here is this framing that the students have to find ways to stand out. But I think what is also true is that the residency programs have to find ways to sort the applications. And aren’t students correct that these significant extracurricular activities and dedication to research are highly valuable and are helpful in the process?
Eric Warm:
Kate, I love that question because I can’t read all the stuff that you send me. It’s too many. I had 3,000 applications to my program last year. Each of them filled with hundreds of things and it’s a sad state of affairs. Now, here’s something to consider. Maybe, and I’m not a neurosurgery doctor, so I don’t know, I don’t want to speak for them, but I’m projecting to their life. Maybe the fact that somebody is willing to do 58 abstracts, presentations, and papers to crush themselves in medical school so hard is in fact what neurosurgeons want in their applicants, so maybe they’re getting what they want even though they don’t read all those papers and abstracts, they just look at the number. That’s possible. That’s not for me in internal medicine, we’re not doing that.
David Hirsh:
Although Eric, in the paper, we also did the work of researching whether it’s predictive, whether doing these work experiences, volunteer experiences, research experiences, predicts future clinical skill or clinical outcomes. And the answer is at this moment, we don’t have any data that suggests that those things actually predict excellence in residency or beyond. We do have knowledge from the great work of David Asch and others and then more recently that where you trained and how you do when you’re training does predict your future patient correctivities beyond residency. But we don’t have the suggestion that doing all these extracurricular things can predict how you do in residency. Now I think a reasonable critic would say, “Yeah, but how does pass/fail or tiered have any impact on that?” But that’s a worthy question. I would just only answer saying we don’t have the data. So it’s being I think too fundamentalist, about either grading system or only thinking about grading systems as the way to improvement would be to act based on one’s heart rather than on the available evidence.
I might also throw one more thing into the mix of it, just in the spirit of keeping this all spicy, which is that Eric led a paper earlier in which he describes the prisoner’s dilemma and applies this notion to the residency match. There’s many things about the paper that I love and one of which though is it’s honest about three different competitions. There’s the student versus student competition, there’s the school versus school competition, and there’s the residency program versus residency program competition. So I think when Kate, you asked that great question, it reminds me of all the different kinds of competing that’s going on. And I would just offer to… offer one little thought. That kind of competition would be interesting if it raised the bar or improved health care outcomes or made patients safer and healthier, that’d be great. I wonder though whether it’s actually just adding all sorts of backdoor conversations and secondary and tertiary systems and I think the feeling of not good competition and more strain and stress. So I just throw that into our conversation if that’s helpful.
Kate Jennings:
I think it is very appropriate and correct that we’re centering patient care and patient outcomes in this conversation. But something I would add, thinking about the neurosurgery applicants who are doing all this research is that it still may be preparing them for a competency in that specialty. Though it may not be patient care, that specialty may expect that their physicians are doing this research on the side. To think of another example, with family medicine they expect that their physicians are very active in the community, maybe are doing advocacy on the side. I was just thinking about that as well.
Eric Warm:
Those are great thoughts, Kate. I would just respond to say, this is now my perspective only. I would rather see somebody do a really great in-depth couple of things than to do many, many things where they’re basically peripherally involved. And so I’ll get all these volunteer things where people tell me they did a blood pressure screening for one hour on a Tuesday and that’s on their list, right? I’d rather that you were in AmeriCorps, build Habitat for Humanity, or sat with a … person for a year. So how do you judge the quality of those experiences when there’s so many of them? And are you really, as a medical student, getting what you want? I don’t know the answer to those, but I would ask those questions on the way to whatever you put on your CV. And I think we made the point in the paper that we definitely do quantity over quality in most of these things because we can’t tell the quality from the paper.
David Hirsh:
But I think I have an emerging worry if I might add one more, which is the patient lying in the bed. So the clinicians enter the room, they lean over the bed, their name tag swings off their chest or hip and the patient sees the name and feels comforted, I hope, by their caregiver. They’re not actually interested in knowing whether someone did extracurricular activity. They’re not interested in even knowing whether the person is brilliantly published. They want, I believe, to be healthy and cared for and ideally be at an institution and with a provider that have the top health outcomes for some real reason. So I’m not saying this to win in the argument. I just can imagine if I turn the whole thing around and imagine it through the eyes of a patient, and by the way I say a patient that means me, us, any of us, I think we’re deeply hopeful, we’re often deeply fearful. I think we’re deeply hopeful though that the care we got is of the highest order. I think that would be the principal interest from that perspective.
Toni Gallo:
You’ve all talked in different ways about this idea of teaching clinical excellence, measuring it, and then actually passing on that information from the medical school to the residency program. The information that’s being shared maybe is not reflective of that or is not as helpful in distinguishing between residents. What information actually gets measured and shared and how does that part of the application and selection process work? And where might there be opportunities to do better there?
Eric Warm:
This is a complex and controversial question that we’d spend 10 hours on, so I’ll try to get to some thoughts. Okay, so when you’re a medical student, you’re learning, you’re not expected to be great at your job yet. And when you’re measuring in the formative space, that is high risk if that formative stands in for the summative at some point, that gets passed on to your residency program. So separating those two things is really important. Even though we may have formative pieces of data that we collect on a resident, they almost always assume it’s summative because the risks are too high for them. So we just have to know the psychology of the person receiving the data is often not what we intend as the educators. So very complex.
Second, when you start to use clinical data or outcomes data in an assessment, and we’re starting to do that at my place, it’s very, very complex. This has to do with what’s called the attribution versus contribution continuum. And Kate actually… her paper, you should read her paper, it’s on this. So attribution is what I delivered, what I did. Contribution is I was on a team that did. And when you’re a medical student, you’re almost never the one delivering the exact care. So it gets very interesting. How would you know this person’s contribution on a team in which the attending, the senior resident, the intern, the nurse, and everyone else made decisions that actually led to the care outcome? So it’s a very, very complex issue.
And I’ve just given two things, which is we want to be careful using formative data because then people don’t have psychological safety to grow because they have to be great at the beginning. And we don’t want that to be conflated with summative data. Students will often think it is, so will faculty sometimes when they’re giving feedback. And the second issue is how can I even tease out what a medical student has contributed in the arena? There are lots of people thinking about that. That’s probably a topic for a different podcast, but those are things that are so fraught.
And then you put up one more thing, which I don’t know if it happens at Dave’s school, but happens at ours. If I’m an assessor and I want to give feedback to a person who’s getting a grade, and I give them feedback that’s a little bit tough, I’m almost certainly going to have that reflected in my own feedback because we’re humans and if we give something negative, the person gets upset because the stakes are high and we receive that back. And so we’re incented not to give honest feedback or put that in any writing, because we don’t want to have that reflect on our own behavior going forward. It’s a tough dynamic. Humans are really complex with their psychology.
So no easy answers. Toni, I’m sorry I gave such a long answer, but we have to think about these things because then what ends up happening is that nobody wants to, quote, tell the truth, for all those reasons I’ve just expressed to you, and that’s what ends up in the MSPE, the Medical Student Performance Evaluation and letters of recommendation. We actually just wrote a paper that suggests we should stop writing and reading letters of recommendation, because the value is so low. So right now the state of the state is not good.
If you ask me what I would like is validity evidence to say that the students from a medical school are good doctors, and then I would only need the imprint of the medical school with no data at that point to say this person is ready to go. And then what I would like is once I’ve accepted their residents into my program, then to see the actual formative data and say, now let me begin your curriculum from here. Creating a true continuum between residency and medical school. We don’t have any of that right now. So we get the incentives flipped and we get the behaviors that we see.
David Hirsh:
I’m part of a really special group that’s looking at what we’re calling the walking on eggshells problem. So there are Washington State University, Florida Atlantic University, University of Minnesota Medical School, and the University of Colorado the Anschutz School of Medicine. These four schools are together… The people at these schools noticed, the students noticed and the faculty noticed that each party, faculty and students, were walking on eggshells vis-a-vis the other. And what that was leading to was what Eric just described, where students felt they are unable to give frank and full feedback to their faculty, faculty felt they could not give frank and full feedback to their students. The level of interactivity and what you might call the safety of the discourse or the fulsomeness of the discourse was then greatly reduced. And the consequence for education, particularly one where it’s so much based upon skill and where the learners come in quite far from where we need to have them end up when they finish school, feedback, needless to say, this is absurd to say, is just fundamental. And things that challenge feedback so greatly as the inability to give it honestly, that’s obviously a substantial problem, which will also itself have unintended consequences.
So I just wanted to I guess, hold that up after Eric makes the point. You can imagine without feedback, we have a problem for growth. Without feedback, we also have a problem for the subsequent decider like residencies, knowing what they’re getting when they receive this packet.
Eric Warm:
What’s interesting about Kate’s job, Kate, you don’t have grades the way typical people have, so feedback might be more helpful to you. I wonder how you feel about it.
Kate Jennings:
Yeah, I actually think that has been one of the most helpful things about the pass/fail program is that taking away all these little pressures to try to maximize my eval, try to be liked. I can really put myself out there and challenge myself and get really good feedback. And what I hope is that my evaluators are able to be more honest and constructive in their feedback because they know that it’s not going to hurt me. Because I do think that giving and receiving feedback is a highly emotional experience, for the interpersonal reasons that you had mentioned. I’ve now had some of my own experiences where I’ve had the opportunity to give feedback to other people. And every time I’ve held back, there’s a hierarchy there being a student, but not even because I’m scared of my grade or scared of negative consequences for me, but because I am afraid of hurting their feelings or souring a relationship or souring a memory of us together. So I definitely think that’s there.
David Hirsh:
That’s a question because I’m compelled by how you describe that, Kate. Part of this I worry is a bit hackneyed because people will always raise examples of let’s say of the US military for example, but let’s just offer experience of the US military academies. I’ve had an extraordinarily wonderful student I’ve worked with recently. Again, I think that US military academies or service in the US military or maybe collegiate athletes or maybe, I don’t know, dancers or high level musicians. People who’ve had the kind of experiences where feedback is voluminous and crisp and clear and it’s demanding of the utmost response to feedback to reach the utmost level. And it’s wrapped into all sorts of other notions about what we might call professionalism or decorum or upholding the oath, whatever, very high, worthy human notions of what it means to be excellent.
Anecdotally, you’ll hear all sorts of stories from faculty anyway about how those students seem different. But even if we let that aside for a second, could we just ask ourselves why certain venues seem to be places where feedback can be quite robust, and you might even say crisp, quite frank, with high expectations. And other environments, and let’s call it medical school, where there appears to be a feedback problem. I think there’s something in this that we should contemplate.
Kate Jennings:
I think what you’re describing is the culture of a space and what the culture is and how we have to all be in it together to feel comfortable giving and receiving that feedback. I wanted to offer a funny story. On my surgery rotation, I got the feedback not to say yay in the OR. And I remember the resident giving me this feedback, he seemed very nervous to give it, very nervous about how I would respond to it and if he would hurt my feelings by telling me that. And I responded and I said, “Thank you for that feedback. It is very specific. It is a behavior that I can change immediately and I actually was not bothered by it at all.” Just a funny aside.
Eric Warm:
I want to answer Dave’s question a little bit differently, which is the sports metaphor doesn’t really work for me because in sports or in dance, there is a hierarchy, there’s a pyramid, there’s a winner and there are losers. And in medicine everyone’s supposed to be a winner. That’s just the way it is. And in sports you have clear defined outcomes. You’ve either won the game or you lost the game, you won the race, you lost the race. And medicine, especially in the clinical arena, we don’t have those clear outcomes. You don’t know if you did good today if the outcome doesn’t happen when you rotate off service or if the attending leaves. It’s such a murky thing that we don’t know the true outcomes of our behaviors. We don’t know what good and bad is in real.
And you mentioned David Asch’s study. Well, in that study, just for those who don’t know that study, 2009, they looked at obstetric complications in four million deliveries in New York and Florida and found that some residency programs had high complication rates and some had low complication rates. And it turned out that if you came from a program with high complication rates, you had lower performance for your career. I think they measured up to 23 years afterwards. If you came from a high performing program, you were high performing for your career. The problem is the people in those programs didn’t know if they were in the high or low performing programs. None of us do. And so we don’t even know the outcome of the game that we play. So it’s a game of incomplete information that makes it difficult. And so if everybody has to be a winner and nobody knows if you won, the default is to say, well, we won. We’re winners. And it’s hard to differentiate all the true “winners” in that environment.
David Hirsh:
Yeah, I’m reticent to go like toe-to-toe in sports metaphors or anything else. But I think the thing that I’m getting at… there are two things. One is it seems that certain learners and certain teachers, I’ll call them, have, to use Kate’s framing, a different cultural notion of what it means to give frank and clear feedback with high expectations. So in the very least, even if we’re in this murky land that you’ve described, Eric, and it’s far more complex I suppose than a sport could be, I guess, but it doesn’t strike me that it’s different in its expectation, or it shouldn’t be different in its expectation of having very, very high standards and having a never-ending pursuit of improvement. Obviously there’s no such thing as the great doctor. There’s only obviously great enough and then the inexorable charge to be greater yet on behalf of your patients. So I am just troubled about unintended consequences of any system that sets us up for lower bars or the inability to speak frankly about things that would improve us further and further.
Eric Warm:
That existed before pass/fail. Because in the coaching metaphor with sports, the athlete knows if I listen I’ll do better. But in medical school, sometimes medical students feel, “I don’t want your feedback unless you tell me I’m great because anything that’s less than great might hinder my grade.” And that existed before pass/fail. It certainly exists… Well, in a differentiating world, it’s going to be there always. What’s interesting from Kate’s perspective is this year she’s going pass/fail and then fourth year is going to be differentiating again. And she mentioned Step 2. So a risk to her of doing pass/fail was that she may not get into a competitive specialty or a residency because she doesn’t have that differentiation. So many medical students hide their weaknesses in order to get a grade. That’s the perverse incentive that faces medical students in a normatively competitive world that existed before pass/fail. Pass/fail produces another set of perverse incentives.
Kate Jennings:
The other perverse incentive is to ignore anything that doesn’t directly go towards a grade in some way. Something I was reflecting on is in the preclinical environment, we have this clinical skills course once a week, and I think that’s a really great curriculum for developing those essential clinical skills. But during that time, we’re taking exams every two weeks and it’s really hard to dedicate mental energy towards that hour which isn’t concretely associated with a grade. So even when we make attempts to add elements that will help students be stronger physicians, those can fall to the wayside and even students can resent those additional activities when there’s not a compelling grade associated with it.
David Hirsh:
This point you raise Kate, it feels spot on to me for this dialogue around unintended consequences and the different kind of, whether it’s social psychology or economics or other fields that have looked into human motivation and human behavior. Many people will often… you’ll hear this quoted, this great paper called A Fine is a Price by Gneezy and Rustichini. This is the so-called daycare study from 2000. And many will know this, but for those who don’t know, essentially people like me would sometimes be late picking up their child from daycare, if they’re so fortunate to be able to have daycare.
So the daycarers in this study were rightly bothered by the parents arriving late, and as the story goes, they would put a fine, a monetary amount, out there that you’d have to pay if you were late. The assumption would be of course, that if there’s a fine people will get ship-shaping, they’ll be on time. But of course, as the story is… as my rambling is suggesting… people ended up not only not improving their lateness, more people were late because as the paper was entitled, A Fine is a Price. It was factored in, the price was put into this.
And there are many, even much more current examples, this debate around not using the SAT, this happened around the time of COVID, to not use the SAT because of a worthy hope of having a more diverse set of applicants getting into schools and colleges and universities, to find all sorts of wonderful applicants who might otherwise not get in if the SAT was considered to be a barrier. But it turned out when they stopped using the SAT, the response was the opposite of what they had hoped. They actually had less diversity around the very people they were seeking to have enter these colleges. So the SAT was restored. There are many, many, many examples, well-researched examples with data. At Dartmouth College, they had very clear data. It was even published in many places including The Wall Street Journal demonstrating this.
So I just want to throw out there, my long salvo here is to throw out there that we set up systems with certain teleological rationale. And I think the rationale for pass/fail made sense and even one would say even makes sense. But the question I have to offer is, are we seeing that which we hope for? And I think your point, Kate, raises this powerfully for me. Are we seeing all the things we hope for? We may end up seeing them, we may not end up seeing them, but in the very least, schools need to be measuring quite closely in real time, real time knowledge of what our systems are doing.
Toni Gallo:
In your paper, you present this model, the complex adaptive systems model, to look at this. Rather than offer specific suggestions for this transition from undergraduate to graduate medical education, you suggest that we think about that time as a complex adaptive system and within that take into account these unintended consequences. So maybe you can briefly talk about what a complex adaptive system model looks like, and maybe how it could be helpful here as we’re talking about residency application and selection.
Eric Warm:
So, much of what Dave and Kate talked about I think feeds into this. So a complex adaptive system is defined by… first element is it has distributed control, which means no one’s really controlling all the minds out there. No one’s told Kate what to think. Kate has people that she talks to, but there’s no way to control the system. The second part, after distributed control, is connectivity: everything that you do touches another person in a web. We’re in this complex web that’s human. Then there’s co-evolution: when I act, I change the system, you act and we act together, not knowing what the other might do.
The fourth thing is sensitivity to initial conditions, which means if we are hypercompetitive at the beginning, we’re probably going to be hypercompetitive in the thing that we do. The fifth is probably the thing we’ve been talking about the most, there’s a sense of emergent order in a complex adaptive system, which means you cannot predict what’s going to happen. It gets built together and every action has a reaction. And then from that, there’s another reaction. It’s constant back and forth about predicting. The sixth part is we’re far from equilibrium. Humans don’t act in equilibrium. We’re always moving, moving, moving.
And then lastly, there’s a state of paradox. You have stability and instability, you have order and disorder, and you have cooperation and competition. These are the elements of the complex adaptive system. You often see them in places with volatility, randomness, uncertainty. And that is I think how most medical students feel about the match.
So I’ll stop there as a description of what the elements are of a complex adaptive system and then see what Kate and Dave think. But then we can talk about the thing that we said in our paper with humility is that therefore in a complex adaptive system, I cannot tell you what to do, but we can do some things to reduce the perverse incentives that we set up to drive towards better shared outcomes. I’ll stop there and see what Dave and Kate have to say.
Kate Jennings:
I find myself actually wanting to step out of the system and look at it and see that we have these students fighting to differentiate themselves. And then residencies trying to figure out what data is going to translate to students being most prepared for residency so that they can then select those students that are most prepared for residency and have the best training with the best. But all of us need to be prepared to serve our community. And so I feel like the selection process itself is an example of normalized deviance.
David Hirsh:
I think the point is very important. We’re a system within systems within systems. So the getting into college system is also setting certain amounts of things people have to fill out on their so-called common application and lest you not have enough items to fill it out. And then the schools are choosing based on a bunch of things which aren’t necessarily about predicting who will do well in the schools or whether they’ll contribute most to society. And then within colleges and universities, people are competing in certain ways. So I think to lay this all on medicine is not fair or accurate.
I will say that within medicine, I would wish for what I’m hearing Eric call for, and the paper calls for, which is some alignment. This way, maybe our normalized deviance could get back towards something very, very special. Some alignment between running for excellence, everything working for better and better and better at an agreed-upon main goal, which is learning as much as I possibly can in order to serve that patient in front of me and the population of patients in my community, whatever that means. This could be specialist or generalist, academic or community, just the utmost we can do to be great at that. And agree that that is the goal. And then have the other incentives at least be as close as possible aligned around that goal.
I’m inclined to say one more thing, as I hinted at earlier. This can’t only rest on pass/fail versus tiered grading or on whether you have longitudinal integrated type clerkships or traditional block type clerkships or whether you have a school that is a two plus two, two years so-called basic science, two years clinical science, or a three-phase curriculum or any of these other modeling things that we do. Whatever we do, it would seem to me that it should at least be for the health and wellbeing of my patient and our patients and the utmost ability of that, and we should just, I think, work really, really hard to think about that before we implement something new.
Eric Warm:
I love that, Dave. It’s about, imagine if we started with the validity evidence that says that this model, whatever model, whatever grading system, could we do the David Asch study for institutions and people in those institutions, so we would know what better was? I listened to both of you there and there was an interesting set of language. Kate said she wanted the best, which how do you define that? We don’t even know. We don’t agree. We don’t have metrics that are actually valid there. And then David said excellence. And again, we don’t agree what excellent is. And I don’t know, somewhere between excellent and best is probably where we should end up.
But imagine that you were in that David Asch study and you were a residency program and you found out that you were one of the lower performing programs. What would you do if you knew that? And that had nothing to do with rankings that are public, but it had to do with your performance as creating doctors who are good at their job and their job defined by taking care of patients. I would want to know that. We don’t have a system in this world right now to do that, but perhaps that would align incentives if we could do that in a way that was meaningful.
David Hirsh:
I think it poses a fun thought question though. If you knew and it didn’t matter, if there was no consequence to knowing that you were completely mediocre, how would that change behavior compared to if you knew and it did matter?
Eric Warm:
Let’s say that you were rated in U.S. News and World Report as a top residency program and that’s how you were viewed by the world, but you had this inside information that you were performing poorly at taking care of patients, what would you do with that information? You wouldn’t make it public. So yeah. So we’re again back to what humans do with incentives.
David Hirsh:
Yeah. I don’t find this cynical. I just think … you’ve, with me privately on numerous energetic phone calls, we have talked about dominance hierarchy. I don’t want to express our relationship to the public so much, but we’ve had really, really interesting conversations about whether people compete and rank, systems that compete and rank. And I think that we have to think about, yes, motivation, incentivization, but also this general idea of, I don’t want to misquote this or misrepresent it, but the struggle upward. Going to do better for some reason. And I think in circumstances where there would be no consequence or very little consequence or maintenance of mediocrity had some high value, it seems less likely that we would be motivated towards the difficult work of change. I imagine that all med schools and hopefully all med students and faculty are strivers who seek to do better and better. But then the question I would get to is, is the system set up to make better and better in service to the thing we actually care about most?
Toni Gallo:
One of the ideas that’s come up a number of different ways in our conversation today is all of the different groups that are involved in the residency application process, whether it’s the information that’s being collected and shared from the medical school, how learners are behaving, what residency programs are looking for. So there’s this very complex system with lots of different groups and people involved. How do you think about everybody’s role in that and how one group can’t fix it by themselves, everybody has to come together? We’ve talked about what is the information that we want to know? How are we defining excellence and how are we thinking about what are the qualities that we should look for in our applicants? So how do you think about all of those different groups and the way that they need to come together here?
Eric Warm:
In medical education we often start with learning objectives and everything in your curriculum should drive towards those objectives. And so again, what is the objective of medical school? If you look at the output, which is the Medical Student Performance Evaluation and all the rankings that David mentioned, it seems the objective is to differentiate between medical students and not create good care. So I would want every person along the chain from the first year medical student to the dean to say the learning objective or the objective is to produce high quality doctors and then drive everything that we could towards that. Now we have to define what that is. So again, when you define a learning objective, you and your presenters have to agree on what you’re presenting. So what is it that would create high quality physicians and then how could we measure that in a meaningful way? And then how can we work it all back from that to create good care?
And the point that David and I made in our other paper, which is that again about social dominance hierarchy, we’re always going to have a best and the worst. It’s harmful though when the worst is very far away from the best. So I’m very practical, in a world in which discrimination between best and worst will always be because we’re humans, we should allow for that. At the same time making the harm of the gap less and less and less. And if I was the person setting up the system, I would think about it that way.
Kate Jennings:
I would love to see more communication with students about the purpose of the standardized tests we take and what evidence is or isn’t behind them. Because what I’ve observed is there is a lot of student skepticism towards these tests. A feeling that a test doesn’t capture my clinical skills. I’m more than a number comparing me to other people. And that is an appropriate and protective response in a system, at least like ours, we’re on quartiles at University of Cincinnati where by definition half of students are going to feel less than and a quarter are going to feel like, “I’m on the bottom.” And I also think that students are right to have some skepticism about the utility of the tests, although it seems to be one data point that we have with a correlation to patient outcomes, which is very small, it’s still very nuanced about what the tests do and don’t capture.
David Hirsh:
I’m super interested in this framework, Kate, and I have to just say thank you so much for sharing that because I’m drawn back now to maybe other systems that I’ve not been part of. I was not a college athlete. I have not served in the US military, but I’m imagining from what I know from students who I’ve been close with and listened to over the years, there’s something in this for me about agency.
So in certain circumstances, if you’re in the bottom quartile, you’re very, very energetically fired up to not stay there. Now maybe that sets up competition, which is hurtful or in some ways not analogous to the needs of a medical school compared to these other places where that happens. On the other hand, maybe there’s something to do with selection or as you said earlier, culture, something where it’s not remotely a skepticism of a system or woe is me or what’s wrong with me, but rather I’m here to inextricably continue to get better. And if I find out news I’m not doing well enough then I’m even more fired up than I was before.
I’m not saying that I have that strength. Maybe I aspire to it, but observing other educational venues or learning about them, I’m left to think that medical school just appears very different in exactly the way I think you just characterized. You characterized one school. But I am familiar with that across many, many dozens and dozens of schools I’ve been to in this country and others.
I guess one more thing I’d say is I’ve also been fortunate enough to sit in law school classes and business school classes at some very well-regarded schools. And the discourse and the way in which those classes proceed is quite different than I observe in medical school, along the lines of I recognize that I am being judged right now as I speak. And I’m going to come in really, really prepared. And when I speak, I’m going to be very, very clear and cogent. If I don’t get it right, I’m going to get it right next time for darn sure. I observe this. So I mean that’s also what professors at these schools say they’re trying to imbue.
So I just want to throw that into our mix also because there may be something about either our selection or the cultures we’re creating or maybe something in medicine which is good in this regard or something which maybe leaves us still not as ideal as we’d want to be.
Toni Gallo:
All right. So as we come to the end of our time, I want to give each of you a chance if you have any final thoughts you want to share, anything you want to leave listeners with, maybe anything that didn’t come up in our conversation, I’ll give each of you a chance to share those thoughts. Dave, you want to go first?
David Hirsh:
I guess I feel like a listener deserves at least one of the three of us, if not all of us, to weigh in a little bit. I’ll just say I came to the question of pass/fail versus tiered grading agnostically. I just was interested in knowing what was out there and what were the arguments and were they evidence-based and so forth. But the longer I have read about unintended consequences, the longer I’ve shared this back and forth with colleagues and students, plenty of students, the more I’ve been involved in researching and thinking about this, the more I’m compelled to think of two things.
One is, I think we have not taken the unintended consequences seriously enough. And I think if we did, we would have some counterweight to the pass/fail movement. I’m not saying we shouldn’t have pass/fail, I’m just saying we’d recognize a pretty hearty counterweight. So that’s one thing. And the second thing is, I think humans are indeed motivated by both internal and external stimuli. And I say this because I think we all know that. So it’s good if those are aligned and it’s good if my personal needs are matching my service needs. I think it’s not a good system if my personal reasonable need to get into a residency or a location later in life is somehow in conflict with being the most attentive, diligent, able, committed doctor I can be.
So these things I will just tell you are leaning me increasingly towards tiered grades with one or two caveats. I don’t want to talk too much, but just one or two caveats. One is bias is a scourge that we have to address. So the degree to which tiered grading could concretize bias in our system would need to be addressed. I will say that pass/fail also has a bias problem, which is that those… people figure out other ways to get their message to residencies. So there’s an entire shadow economy of information transfer that people who are more connected, traditionally more connected have that others might not. So we should look for bias in both systems. And I think the second thing I would say is that as we make any change forward or backward, it’ll have unintended consequences. So this idea of constantly measuring, closely measuring to see what’s happening in the short and long term is a must.
But all that being said, I’m putting my tiny little nickel on the table and saying I’m leaning more and more towards tiered grades because I believe it is better for clinical excellence. It’s better for patients, I think. There’s no data. There are no data, but I think it lines up for that. And I think it might be a way if we do it well, of reconnecting the personal residency goals and the clinical excellence goals. But I could be wrong, and one would have to measure.
Toni Gallo:
Eric?
Eric Warm:
In our paper, we offer with humility, again, steps, and you can read the paper and see the steps. I think two of them, I would just underscore, and I think David made the point. In a complex adaptive system, you need to incentivize outcomes or activities. So in this particular one, to shift from accumulating numerous experiences to meaningful experiences in however we define meaning. And align those incentives across the entire system, which we’re not doing now. And the second, which I think is interesting to me, just because I’m so interested in, is that when we make these big changes, they’re going to affect thousands or tens of thousands of people, we could use scenario planning before implementing that to try to understand what might happen or what rational actors would do, modeling those unintended consequences as best we can, so we can understand it ahead of time and maybe engage experts in complexity science as we do this work rather than just us folks who don’t really know much about it making these big changes. So those would be two things I would add. There’s more in the paper, if people are interested in reading it.
Toni Gallo:
Kate?
Kate Jennings:
Yeah, I just wanted to share one last thing. As a student who transitioned to a pass/fail pathway, I was surprised that that posed its own challenges and that when I was initially on the pass/fail pathway, even then I felt like I need to find a way to compare myself to other people. I don’t know where I stand. I felt unmoored. And what I had to do is I had to find my own grounding principles. I had to find my own way to say, even though I don’t have some quantitative feedback telling me where I stand, I’m proud of the work I did today, I achieved the goals that I set out to do. And so my ultimate takeaway in all this for students on a pass/fail pathway or not, we have to find our own values. And I think that can give us a sense of agency back in all of this.
Toni Gallo:
Well, I want to thank you all for being on the podcast today, for talking about your paper and hopefully giving our listeners some different things to think about as they’re thinking about this transition from undergraduate to graduate education and all of the different pieces that are involved here and how we can move forward. So thanks very much to all of you and I encourage our listeners to read the paper that we talked about today, which is available now on academicmedicine.org.
David Hirsh:
Thank you so much, Toni. What a joy.
Kate Jennings:
Thanks, guys.
Toni Gallo:
Remember to check out the article we talked about today, as well as other articles that look at the residency application and selection process. Those are all available on academicmedicine.org.
From the journal’s website, you can also access the latest articles and our archive dating back to 1926, as well as additional content like article collections. Subscribe to Academic Medicine through the subscription services link under the journal info tab, or visit shop.lww.com and enter Academic Medicine in the search bar. Follow us and interact with the journal staff on LinkedIn at Academic Medicine Journal. Subscribe to this podcast anywhere podcasts are available. Be sure to leave us a rating and a review when you do. Let us know how we’re doing. Thanks so much for listening.