Those following education know that policy focused on “teacher quality” has been by far the dominant paradigm for improving schools over the past few years. Some (but not nearly all) components of this all-hands-on-deck effort are perplexing to many teachers, and have generated quite a bit of pushback. No matter one’s opinion of this approach, however, what drives it is the tantalizing allure of variation in teacher quality.
Fueled by the ever-increasing availability of detailed test score datasets linking teachers to students, the research literature on teachers’ test-based effectiveness has grown rapidly, in both size and sophistication. Analysis after analysis finds that, all else being equal, the variation in teachers’ estimated effects on students’ test growth – the difference between the “top” and “bottom” teachers – is very large. In any given year, some teachers’ students make huge progress, others’ very little. Even if part of this estimated variation is attributable to confounding factors, the discrepancies are still larger than most any other measured “input” within the jurisdiction of education policy. The underlying assumption here is that “true” teacher quality varies to a degree that is at least somewhat comparable in magnitude to the spread of the test-based estimates.
Perhaps that’s the case, but it does not, by itself, help much. The key question is whether and how we can measure teacher performance at the individual level and, more importantly, influence the distribution – that is, to raise the ceiling, the middle and/or the floor. The variation hangs out there like a drug to which we’re addicted, but haven’t really figured out how to administer. If there were some way to harness it efficiently, the potential benefits could be considerable. The focus of current education policy is in large part an effort to do anything and everything to try to figure this out. And, as might be expected given the enormity of the task, progress has been slow.
The most oft-discussed form of this effort is designing new evaluation systems, which represents an attempt to measure “true” teacher quality in a valid, reliable manner. If one can measure teacher performance, then one is, of course, in a much better position to try to improve it.
The seductive variation in teacher performance, coupled with the inadequacy of current evaluation systems in many places, has compelled some (but not all) states and districts to rush ahead with this process without field testing, and to begin using their new evaluations in high-stakes decisions, including termination. There is, as yet, little indication as to whether these new systems (and, more importantly, the ways in which their results are used) will improve outcomes. There are plenty of strong opinions either way.
But evaluations and the decisions to which the results will be tied are really only one front in this all-out campaign, even if they are the element that gets the most attention. The assumed variation in teacher “quality” is arguably the primary motivation for any number of additional efforts, including but not limited to:
- Measuring the efficacy of teacher training programs, and holding them accountable for it;
- Altering compensation structures to attract/retain more qualified candidates;
- Videotaping lessons to see if there are practices common among more effective teachers;
- Expanding alternative certification routes;
- Experimenting with systems for collecting and distributing student data to inform instruction;
- Weakening teacher job protections;
- Stepping up processes to screen applicants to open teaching positions;
- Limiting teacher sick days;
- Changing teacher layoff criteria;
- Trying programs for new teachers, such as mentoring and induction.
Some of these projects have been around a long time; others are relatively new. Some have broad support, while others can be rather controversial.
But they all share a common foundational purpose – to try anything to harness some portion of the variation in teacher quality, which appears so large that even small improvements, though very difficult to bring about, might yield substantial benefits.
There is thus far little evidence as to whether these interventions work (in some cases because the policies are relatively new). No doubt there will be successes and failures, though we’ll never know unless we evaluate these programs. The reasonable expectation is that even a set of well-designed and well-implemented policies (which is no guarantee) would produce small, gradual improvements in the distribution of teacher quality over a period of years and decades.
In the meantime, the allure persists, unwavering.
- Matt Di Carlo