The 5-10 Percent Solution

Posted on December 16, 2010

** Also posted on “Valerie Strauss’ Answer Sheet” in the Washington Post.

In the world of education policy, the following assertion has become ubiquitous: If we just fire the bottom 5-10 percent of teachers, our test scores will be at the level of the highest-performing nations, such as Finland. Michelle Rhee likes to make this claim. So does Bill Gates.

The source and sole support for this claim is a calculation by economist Eric Hanushek, which he sketches out roughly in a chapter of the edited volume Creating a New Teaching Profession (published by the Urban Institute). The chapter is called “Teacher Deselection” (“deselection” is a polite way of saying “firing”). Hanushek is a respected economist, who has been researching education for over 30 years. He is willing to say some of the things that many other market-based reformers also believe, and say privately, but won’t always admit to in public.

So, would systematically firing large proportions of teachers every year based solely on their students’ test scores improve overall scores over time? Of course it would, at least to some degree. When you repeatedly select (or, in this case, deselect) on a measurable variable, even when the measurement is imperfect, you can usually change that outcome overall.
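To make the selection logic concrete, here is a minimal sketch in Python. It is not Hanushek's calculation. It assumes (my assumptions, chosen only for illustration) that true teacher effectiveness is normally distributed, that the yearly measure is the true value plus random noise, and that every replacement is a fresh draw from the same distribution. The function name `simulate_deselection` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_deselection(n_teachers=10_000, fire_fraction=0.08,
                         noise_sd=1.0, years=10, seed=0):
    """Return mean TRUE effectiveness for year 0 and each later year.

    Illustrative assumptions: true effects ~ Normal(0, 1); the observed
    score is the true effect plus Normal(0, noise_sd) error; replacements
    are new draws from Normal(0, 1), i.e. average on the whole.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    n_fired = int(n_teachers * fire_fraction)
    path = [statistics.fmean(true_effect)]

    for _ in range(years):
        # Each year the measure is re-drawn with fresh error.
        observed = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
        # Rank teachers by the noisy measure, lowest first.
        order = sorted(range(n_teachers), key=observed.__getitem__)
        # Replace the lowest-ranked teachers with new hires.
        for i in order[:n_fired]:
            true_effect[i] = rng.gauss(0.0, 1.0)
        path.append(statistics.fmean(true_effect))
    return path


if __name__ == "__main__":
    for sd in (0.0, 1.0, 2.0):
        path = simulate_deselection(noise_sd=sd)
        print(f"noise_sd={sd}: start {path[0]:+.3f}, after 10 years {path[-1]:+.3f}")
```

Under these assumptions the average rises even when the measure is noisy, but it rises less as the noise grows. The sketch says nothing about whether the replacements would really be average, which is the assumption discussed below.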

But anyone who says that firing the bottom 5-10 percent of teachers is all we have to do to boost our scores to Finland-like levels is selling magic beans—and not only because of cross-national poverty differences or the inherent limitations of most tests as valid measures of student learning (we’ll put these very real concerns aside for this post).

Before addressing the argument directly, it bears noting that this policy, even if it went down perfectly, would not be a quick fix. The simulation does not entail a one-time layoff. We would have to keep firing the “bottom” 5-10 percent of teachers every year, as a permanent policy. Then, according to the calculation, and if everything went as planned, it would take around 10 years for U.S. test scores to rise to the level of the world’s higher-performing nations.

It also seems improbable that we could ever legislate, design, and carry out such a policy on a large, nationwide scale, even if it had widespread support (which it doesn’t). Yet that’s what would be needed to produce the promised benefits (again, assuming everything went perfectly).

But what if we could do it? Would it work? As I said, there would almost certainly be some increase in overall test scores, at least in the short term (whether or not that would signal proportional true improvement is a different matter entirely). But would the gains be large and sustained? It’s always difficult to project the impact of an untried, drastic intervention like this, but I would argue probably not. In fact, there is a risk that this type of policy would end up hurting overall education performance in the long run, especially in higher-poverty, hard-to-staff schools and districts.

The presumed benefits of this proposal rely on several shaky assumptions, some of which would, if violated, carry negative consequences. One assumption, which I have discussed before, is that the replacement teachers will be of sufficient quality (on the whole) to produce at least average student test score gains. Hanushek’s calculation assumes that the replacements will do so (though, among other things, it’s unclear whether he uses the average gains for a first-year teacher, which are lower).

Currently, around 8-9 percent of teachers leave the profession every year, and this will probably increase as baby boomers retire. Maintaining the deselection might place substantial strain on the labor pool (of course, there would be some overlap – teachers who would be fired under the proposal would have left anyway).

In particular, high-poverty and other hard-to-staff schools—which already have problems finding good new teachers—would have to replace even more teachers every year, while choosing from an ever-narrowing applicant pool (it seems that much of California is in trouble right now). The assumption that the quality of replacements would remain stable is rather unsafe, and the calculation hinges on it.

Moreover, you can bet that many teachers, faced with the annual possibility of being fired based on test scores alone, would be even more likely to switch to higher-performing, lower-poverty schools (and/or schools that didn’t have the layoff policy). This would create additional, disruptive churn, as well as exacerbate the shortage of highly-qualified teachers in poorer schools and districts.

When all is said and done, it’s conceivable that, taking the firings, attrition, and switching into account, the total annual mobility rate for all teachers could approach 25 percent, and it would be much higher in poorer schools and districts (making these students bear a disproportionate burden for this unintended consequence). It’s hard to imagine a public education system that could function effectively under those circumstances, let alone thrive.
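The back-of-the-envelope arithmetic behind a figure like this can be sketched as follows. Only the attrition range (8-9 percent) and the firing range (5-10 percent) come from the discussion above. The overlap share (fired teachers who would have left anyway) and the switching rate are placeholder values I picked for illustration, and the function name `total_mobility_rate` is hypothetical.

```python
def total_mobility_rate(attrition, firing, overlap_share, switching):
    """Rough annual share of teachers who leave or change schools.

    attrition     -- share leaving the profession on their own
    firing        -- share dismissed under the policy
    overlap_share -- fraction of the fired who would have left anyway,
                     so they are not counted twice (assumed value)
    switching     -- share moving to another school (assumed value)
    """
    inputs = {"attrition": attrition, "firing": firing,
              "overlap_share": overlap_share, "switching": switching}
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return attrition + firing * (1.0 - overlap_share) + switching


if __name__ == "__main__":
    # Low and high ends of the ranges in the post; 0.2 and 0.08 are placeholders.
    low = total_mobility_rate(0.08, 0.05, overlap_share=0.2, switching=0.08)
    high = total_mobility_rate(0.09, 0.10, overlap_share=0.2, switching=0.08)
    print(f"total mobility: {low:.1%} to {high:.1%}")
```

The total is only as good as the placeholder inputs. Changing the assumed switching rate or overlap moves the result directly, so this is a way to see how the pieces add up, not evidence for any particular number.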

Remember also that a widespread test-based firing policy would almost certainly change the “type” of person who chooses to pursue teaching (or, for that matter, chooses to remain). I find it hard to believe that any top-notch applicant would be attracted to a low-paying profession because of a systematic layoff policy (see here for an alternative view). There’s no way to know, but my guess is that the opposite is true. If so, the policy’s projected benefits would be further mitigated.

The simulation also assumes that all the dismissed teachers would leave the profession permanently. Again, this seems highly unlikely, especially if replacements are in short supply. Rather, I would speculate that a significant proportion of dismissed teachers would get jobs in other districts. In doing so, they would seriously dilute the policy’s effects, while also creating needless turnover for schools.

Then there is the issue of error. Due to the well-known imprecision of value-added models, and the year-to-year fluctuation of teacher effects, many replacement teachers would be no better or worse than the fired teachers would have been (error will be particularly high among newer teachers, due to small samples). There is something unethical about firing people based solely on measures that may be wrong due to nothing more than random statistical error, yet these mistakes would have to be tolerated, as collateral damage, in the name of productivity. But, if the replacement pool runs dry, there would also be practical consequences: we will have fired many solid teachers, whom we might have identified as such with more nuanced measures.
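A small simulation can illustrate how measurement error turns into wrongful dismissals. This is a sketch under simplified assumptions of my own, not a value-added model: true effectiveness is normally distributed, the observed score is the true value plus random noise, and "wrongly flagged" means a teacher who lands in the observed bottom group without being in the true bottom group. The function name `wrongly_flagged_share` and the parameter values are hypothetical.

```python
import random


def wrongly_flagged_share(n_teachers=10_000, fire_fraction=0.08,
                          noise_sd=1.0, seed=0):
    """Share of flagged teachers who are NOT truly in the bottom group.

    Illustrative assumptions: true effects ~ Normal(0, 1) and the
    observed score adds Normal(0, noise_sd) error to each true effect.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    observed = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    n_fired = int(n_teachers * fire_fraction)

    # Teachers who really are in the bottom group.
    by_truth = sorted(range(n_teachers), key=true_effect.__getitem__)
    truly_bottom = set(by_truth[:n_fired])
    # Teachers the noisy measure would dismiss.
    flagged = sorted(range(n_teachers), key=observed.__getitem__)[:n_fired]

    wrongly = sum(1 for i in flagged if i not in truly_bottom)
    return wrongly / n_fired


if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0, 2.0):
        share = wrongly_flagged_share(noise_sd=sd)
        print(f"noise_sd={sd}: {share:.0%} of dismissed teachers wrongly flagged")
```

With no noise, nobody is wrongly flagged. As the noise grows relative to the true spread, a larger share of the dismissed teachers are people who did not belong in the bottom group. How much noise real value-added estimates carry is an empirical question this sketch does not answer.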

Finally, on a similar note, the quality of teachers who constitute the “bottom” 5-10 percent varies by location, and by poverty level (though not drastically). Imposing a widespread dismissal system would therefore result in the deselection of many teachers who would have done quite well in a different school or district. Firing these teachers solely to meet a quota is a harmful practice (again – especially if there are shortages).

In short, this proposal would be slow, risky, and unfair, and it would require us to deliberately engineer test score gains for their own sake, in the most brutal manner possible. It would also be, I argue, unlikely to work, or at least not to anywhere near the advertised degree.

Is this really our best option?

Hanushek doesn’t think so. Talking about the systematic firings, he notes, “In the long run, it would probably be superior…to develop systems that upgrade the overall effectiveness of teachers.” He points out, however, that these efforts have not been successful in the past. But have we really tried?

Instead of trying to fire our way to the high performance of Finland or anywhere else, why not try to emulate the policies that these nations actually employ? It seems very strange to shoot for the achievement levels of these nations by doing the exact opposite of what they do.

In any case, Gates, Rhee, et al. constantly repeat the “fire 5-10 percent” talking point, along with the promise of miracle results, because of its potent political message: all we have to do is fire bad teachers, and everything will be fixed. They use Hanushek’s calculation to provide an empirical basis for this message. They do not, however, seem at all attuned to the fact that the proposal is less an actual policy recommendation than a stylized illustration of the wide variation in teacher effects.

Let’s stick with meaningful conversations about how to identify, improve, and, failing that, remove ineffective teachers. Test-based measures may have a role in the evaluation of both teachers and overall school performance, but not a dominant one, and certainly not an exclusive one.

Systematically firing large numbers of teachers based solely on test scores is an incredibly crude, blunt instrument, fraught with risk. We’re better than that.


17 Comments posted so far

  • Moreover, you can bet that many teachers, faced with the annual possibility of being fired based on test scores alone, would be even more likely to switch to higher-performing, lower-poverty schools (and/or schools that didn’t have the layoff policy)

    But see the recent paper by Figlio/Sass and others showing that teacher value-added scores in Florida and NC aren’t that different in high-poverty schools. If the teachers to whom you refer don’t know the difference between a simplistic look at test score levels and a value-added system that takes poverty into account, perhaps it would be helpful if people who do know the difference didn’t muddy the waters.

    Comment by Stuart Buck
    December 16, 2010 at 10:02 AM
  • Stuart – thank you for the comment, as always.

    I am very much aware of that paper. In fact, it is cited in this post, along with the very point that you make: that VA scores are lower in high-poverty schools, but not drastically so. I suppose I might have repeated it, or moved it to the section you quote, but I don’t think I muddied the waters.

    Anyway, you’re correct that teachers who think they’re *guaranteed* to get better VA scores in lower-poverty schools are misinformed, but your claim that “a value-added system takes poverty into account” is true only to a degree (as you know). There are unobserved advantages to working in a lower-poverty school that the models don’t capture (e.g., peer effects, school environment), even if they don’t translate into huge aggregate differences. Consider also that, as Sass et al. find, the returns to experience seem to be stronger in lower-poverty schools.

    Finally, I would bet that many movers would be motivated by working conditions, rather than job security (I might have made this more clear). For example, teachers might move to more affluent schools to avoid the exacerbated turnover problem that many high-poverty schools would likely face under this policy.

    Thanks again. Please keep reading and commenting.

    Comment by Matthew Di Carlo
    December 16, 2010 at 12:54 PM
  • Thanks for these clarifying remarks. I hadn’t seen that you linked to that paper.

    I do think that if there’s a problem of teachers migrating away from poorer schools under any regime that takes value-added into account, that migration will be mainly the result of teachers failing to understand what value-added really means. Even if researchers can’t fully take everything into account, they can select the basis of comparison: to other teachers in the same school or to teachers across a district. It seems intuitive to me that if teachers in a poor school are being compared only to other teachers within the same school, it is much harder to make the excuse that poor value-added scores are due to anything unobservable about the school — the other teachers to whom you’re being compared suffer from the same school-wide obstacles. (Other objections to value-added remain, of course, such as a small n for any given teacher.)

    Comment by Stuart Buck
    December 16, 2010 at 1:17 PM
  • Thanks for the detailed and thoughtful analysis of an idea that usually receives only a superficial glance. I also agree with you, and I think polls support the idea, that work conditions are the #1 motivating factor in staying at a school or leaving.

    Stuart, I must take issue with your comment about teacher comparisons within schools. There is a considerable obstacle in the fact that teacher-student assignments are not random. Some VAM research found “false positives” – correlation between fifth-grade teachers and fourth-graders’ test scores. I don’t think I’ve ever heard of a school engaging in any truly random assignment unless it was for a study! And as an elementary school parent, I wouldn’t want random assignment – I prefer thoughtful assignment. Then as a secondary school teacher, I can tell you that there are HUGE variables among sections of the same course, with students drawn from the same pool. The pushes and pulls on a high school schedule ensure that certain clusters will form and move through their day together. If your class meets at the same time as certain honors or remedial classes, you’ll have the contrasting group disproportionately represented in your room. If you teach one high-needs special education student with an instructional aide, there’s an extra adult in the room and usually a positive effect. Same student, different time of day, no aide, and the class is harder to teach. I could go on and on.

    Comment by David B. Cohen
    December 16, 2010 at 7:22 PM
  • David,

    You’re correct: There’s a pretty solid body of evidence showing that non-economic working conditions – rather than salary or job security – are the primary factor driving mobility decisions.

    The characteristics of students seem particularly important. For example, see:
    http://edpro.stanford.edu/hanushek/admin/pages/files/uploads/Hanushek+Kain+Rivkin%202004%20JHumRes%20392.pdf

    But salary does matter too:
    http://faculty.smu.edu/millimet/classes/eco7321/papers/clotfelter%20et%20al%2003.pdf

    Here’s a good review of the retention literature:
    http://www.aera.net/uploadedFiles/Publications/Journals/Review_of_Educational_Research/7602/04_RER_Guarino.pdf

    Thanks for the comment.

    Comment by Matthew Di Carlo
    December 16, 2010 at 7:53 PM
  • Just to be clear, since there appears to be some confusion, nothing in these calculations or in the accompanying article says anything about test-based decision making or firing. Value-added measures do provide information, but nobody advocates making decisions solely on the basis of such scores.

    What the article says is that the bottom teachers are harming kids and that we need to find a way to do something about that. The best would be to transform these teachers — through coaching, professional development, or what have you — into better teachers. Unfortunately, we have been unable to find a way to do that systematically and consistently.

    The continual citation of Finland does not help either. What the Finns have learned is how to make sure that an ineffective teacher does not remain in the classroom for very long. This is something we have to learn in the U.S.

    I also do not understand why the vast majority of hardworking and able teachers are willing to be lumped together with the small number of truly ineffective teachers. It surely is not any confusion about who the ineffective teachers are. Parents, other teachers, and principals do appear to know who the ineffective teachers are.

    Developing a good evaluation system for teachers would be a start. Again, we have talked about that for many years, but it has not happened in many districts.

    Comment by Eric Hanushek
    December 16, 2010 at 10:31 PM
  • David — non-random assignment is one of the other potential problems to which I referred, but I don’t see how it has anything to do with the problem I was addressing: the allegation that teachers will leave high-poverty schools in droves because they will be afraid of low value-added scores. If teachers in high-poverty schools are compared to other teachers within the same school, then the fact that the school is high-poverty — in and of itself — ought to have no effect on the value-added scores.

    Comment by Stuart Buck
    December 17, 2010 at 9:19 AM
  • Mr. Hanushek,

    Your comment is much appreciated.

    While I understand what you’re saying about the confusion, I do think I characterized your argument in the manner you describe. I pointed out that it wasn’t an actual policy proposal, but rather an illustration. I also noted your position that improvement is the preferable course. If this was not clear enough, I apologize.

    Nevertheless, your own words are easily misunderstood. In this chapter, in the front end, you write, “This discussion provides a quantitative statement of one approach to achieving the governors’ (and the nation’s) goals – teacher deselection. Specifically, how much progress in student achievement could be accomplished by instituting a program of removing, or deselecting, the least-effective teachers?” And the approach consists of deselection based entirely on value-added estimates.

    This type of statement might be easily interpreted in a manner quite different from your comment. Surely you know how subtlety is lost in our public discourse, and how, taken literally, your calculation represents the intoxicating promise of a “quick fix.” And, indeed, I have heard many people misuse your research to advocate, implicitly or explicitly, for a policy of systematic firing based solely or predominantly on value-added estimates. Perhaps you aren’t aware of how often this happens.

    So many people with whom I have spoken were surprised, reading my post, to learn that you favor, albeit with skepticism, improvement over dismissals. Correct that misperception. I realize you’re a researcher and not an advocate, but your voice carries a lot of weight. When you speak to reporters and policymakers, I hope you lead off with the improvement message. I hope you tell them that evaluations and other measures to increase effectiveness should be our priority. For whatever it’s worth, you’d get tremendous support from many people, including many thousands of the great teachers you celebrate.

    Thanks again,
    Matt

    Comment by Matthew Di Carlo
    December 18, 2010 at 9:43 AM
  • Why, Mr. Hanushek, do “able teachers” not wish to separate themselves from “truly ineffective teachers?” Because there, but for the grace of God, go I. Many of the teachers who have thus far received that unfair label in Los Angeles were actually very good teachers who chose to work with the most challenging students. Teachers like Rigoberto Ruelas, who received this unfair label.

    We see through the smokescreen and know that the data is faulty and not a true measure of a teacher’s worth. We reject the labels placed on teachers through this faulty measurement. And we will not be divided to facilitate the dismantling of our profession because someone has to stay behind to protect the students from the privatization forces that see both teachers and students as a dollar sign or data point, or in your case, a percentage.

    Martha Infante
    California Council for the Social Studies Teacher of the Year 2009

    Comment by Martha Infante
    December 18, 2010 at 1:05 PM
  • Great post, thanks for this. What I can’t understand as a psychologist is that the economic models seem to ignore the psychological consequences. As you point out, firing, especially if it is perceived as arbitrary (leaving aside for a moment whether it actually is arbitrary), has an impact on everyone. It changes pedagogy, and narrows curriculum. If people know that the bottom 5-10% will be fired every few years, it will destroy any chemistry that a school needs to thrive.

    I also find it obfuscatory for Hanushek to claim this:

    What the article says is that the bottom teachers are harming kids and that we need to find a way to do something about that. The best would be to transform these teachers — through coaching, professional development, or what have you — into better teachers. Unfortunately, we have been unable to find a way to do that systematically and consistently.

    This takes a tenuous, uncertain relationship (that teachers are the most important in-school factor for predicting growth in student test scores) and assumes that the best way would be somehow to “transform” the teachers themselves. As Dan Willingham points out, this is not an immutable truth of the world, but a fact of our system. If we had a standard curriculum, or more support, or smaller class sizes in general, this might not be the case. “Coaching, professional development, what have you” assumes that his model (of the relative importance of teaching) is set in stone.
    Further, as he has in other areas, he puts forth the fiction that “resources don’t matter.” We’ve tried professional development, we’ve tried coaching, we’ve tried spending more per pupil, and nothing is working, so we should scrap these approaches. Sure, you could point out that we spend more per pupil, but this requires that you ignore the details of how we have done this. DC, for example, mismanaged how they administered the Special Ed programs, vastly inflating their per pupil costs. Does this mean that since costs per pupil went up, and test scores didn’t, resources don’t matter?

    Comment by Cedar Riener
    December 18, 2010 at 1:48 PM
  • The fact that we haven’t figured out a way to do something systematically and consistently may mean that a systematic solution doesn’t exist. Rather than looking for another systematic solution, it might be better to leave some control to the schools to implement their own non-systematic, inconsistent solutions.

    Comment by Cedar Riener
    December 18, 2010 at 1:53 PM
  • The Huff Po report on VAMs and layoffs (http://www.huffingtonpost.com/2010/12/23/teacher-layoffs-seniority_n_800771.html) reported:

    “Dan Goldhaber, lead author of the study and the center’s director, projected that student achievement after seniority-based layoffs would drop by an estimated 2.5 to 3.5 months of learning per student, when compared to laying off the least effective teachers.”

    But the report says:
    “Teachers RIFed in our simulation are approximately 20% of a standard deviation in student performance less effective in student performance than teachers RIFed in reality.”

    Goldhaber misquoted the Boyd et al. study of 2010, but then he said his results were similar to their conclusion that “The typical teacher who is laid off under a value-added system is 26 percent of a standard deviation in student achievement less effective than the typical teacher laid off under the seniority-based policy.” Boyd then says that the gap would shrink as the new teachers gained experience.

    Am I missing something or is this a huge bluff by Goldhaber et al? After all, the latest Gates MET study’s conclusions contradicted their findings.

    John

    Comment by john thompson
    December 23, 2010 at 8:24 PM
  • I think I understand now what the Goldhaber report says and what it means. I just can’t tell what they actually did.

    Yes, they ran a simulation and the teachers RIFed would be estimated to be less effective by 2.5 to 3.5 months. I did not understand that that was what they were doing for two reasons.

    Firstly, that would require them to run that simulation for each district and each subject matter. Had they done so, I figured, they would have said that was what they did. They have a convoluted footnote that might address that but I couldn’t figure out what it means, and I assumed that such an effort would be reported in the text. So, maybe they did that but did not mention it in prose. But, I wonder if they just ran a macro simulation for the entire state where the bottom 145 teachers were RIFed without regard for whether the replacement worked in that district or not. (Boyd didn’t have that to worry about, and speaking of that, their typo threw me off also.) That would be intellectually dishonest, but if they used a more complex alternative method of running the more complex simulation, would they not have said they had done so?

    Secondly, they described six different VAM scenarios, but they never said precisely which one they used. Had they chosen one method for the simulation, I thought, they would say which one they used. But they promised a VAM that took into account comparability of schools and districts. So, I read the article wondering if they had started with some generic VAMs for the simulation, and used six more refined models for their final tables.

    Rereading again, they said at one point that they incorporated parts of scenarios #2 and #3, and at another point #2 and #5. I didn’t understand #5.

    Regarding #2 they said its weakness was that it could not account for student, classroom, and district characteristics. But then they said it could be adjusted to reach those characteristics. But they didn’t say whether they did that or not in the simulation.

    So when I read and reread the report, I concentrated on the Tables, which was the only place where they said what they were controlling for.

    I have to say that even though I’m not a statistician, I never had these problems reading academic papers on, say, econometrics. There are reasons why scholarship has certain conventions, and if the Gates people followed them, their reports would be more intellectually honest.

    But now I realize they meant what was reported and that their simulation would mean that RIFs based on effectiveness would increase student performance by up to 3.5 months. They just didn’t say how they did the simulation. I’m assuming now that they meant that using VAMs to determine which 2200 teachers get layoff notices would mean that the students of 145 teachers would benefit by that much, but still they say little or nothing on how they reached the headlined conclusion, in contrast to the detail they announced for minor points.

    For instance, could they be running a simulation where those gains would be produced by replacing a senior teacher with low test score gains by a teacher in another type of school in another district who had high gains? That seems too absurd. But it also seems absurd that they run a simulation without reporting what that simulation was.

    Sorry to bother you about this.

    Comment by john thompson
    December 24, 2010 at 1:24 PM
  • Great theoretical discussions but no one has mentioned the overriding factor in new teacher hires – money. You can fire as many as you want but the discussion in my school district is about going out to hire inexpensive/inexperienced teachers, not the most experienced/expensive teachers with a proven track record.

    And really, with the number of teachers that leave the profession every year through retirement or just being fed up, plus this decimation, is there any way to hire 500K to 700K new teachers a year to fill the gap?

    Comment by Michael
    December 28, 2010 at 12:38 AM
  • [...] actually plays out as simulated in the various performance-based layoff simulations which I, and others have recently discussed. The assumptions in these simulations are bold (unrealistic), and much of [...]

    January 7, 2011 at 10:58 AM
  • [...] quickly, particularly if we punish the low performers hard enough. One of their pet claims is if only we fired the worst 5% of teachers, the US would equal Finland in education quality, a claim that’s a second cousin to the spherical [...]

    January 24, 2011 at 1:55 AM


Disclaimer

This web site and the information contained herein are provided as a service to those who are interested in the work of the Albert Shanker Institute (ASI). ASI makes no warranties, either express or implied, concerning the information contained on or linked from shankerblog.org. The visitor uses the information provided herein at his/her own risk. ASI, its officers, board members, agents, and employees specifically disclaim any and all liability from damages which may result from the utilization of the information provided herein. The content in the shankerblog.org may not necessarily reflect the views or official policy positions of ASI or any related entity or organization.
