There is a tendency in education circles these days, one that I’m sure has been discussed by others, and of which I myself have been “guilty” on countless occasions. The tendency is to use terms such as “effective/ineffective teacher” or “teacher performance” interchangeably with estimates from value-added and other growth models.
Now, to be clear, I personally am not opposed to the use of value-added estimates in teacher evaluations and other policies, so long as it is done cautiously and appropriately (which, in my view, is not happening in very many places). Moreover, based on my reading of the research, I believe that these estimates can provide useful information about teachers’ performance in the classroom. In short, then, I am not disputing whether value-added scores should be considered to be one useful proxy measure for teacher performance and effectiveness (and described as such), both formally and informally.
Regardless of one’s views on value-added and its policy deployment, however, there is a point at which our failure to define terms can go too far, and perhaps cause confusion.
For one thing, I’m starting to get concerned that too many people, particularly those outside of education, may take terms such as “performance” and “effectiveness” at face value. Upon hearing these incredibly powerful words, they may not be aware that they almost always are defined — when they’re used in empirical arguments — in terms of test-based effectiveness among a minority of teachers (i.e., those in tested grades and subjects), usually in just one or two subjects.
They may not realize that these estimates sometimes do not match up particularly well with alternative indicators such as observations (or between subjects, tests, etc.). Or that many of the most heavily cited analyses, well-done and important though they are, employed models that few states are actually using in their accountability systems. Or that, until recently, many of them used data from a relatively small group of states and districts, data that were collected before those states and districts began attaching stakes to the estimates. There are, as with any measure, a lot of underlying issues here.
Moreover, based on recent events, most notably the recently concluded Vergara trial in California, I am worried that this “mindset” may be slipping into other realms. For instance, I have no legal training whatsoever, but I could not help but notice something when watching the closing arguments of the plaintiffs’ attorney in Vergara, as well as the eventual decision. That is: the plaintiffs’ case relied overwhelmingly on value-added research, and both the closing arguments and the written decision used the terms “ineffective,” “performance,” etc., many dozens of times, yet not once did either mention what they meant by those terms, namely, that in general they were talking only about value-added estimates.
Similarly, I’m pretty sure that many advocates and commentators who make statements such as “value-added is the best predictor of future performance” are fully aware that “future performance” is typically defined, in a somewhat circular fashion, in terms of value-added. But I’m not sure everyone who hears those kinds of statements realizes that (or how to interpret these kinds of findings in a useful manner).
Now granted, in most policy fields it is common to use words like “performance” interchangeably with proxies for that performance (e.g., GDP and the economy); shorthand can be useful, and no measure is perfect. Moreover, practically nobody involved in the education debate would argue that value-added estimates provide a complete picture of teacher performance/effectiveness, or even anything resembling a complete picture. Finally, virtually all the limitations of value-added estimates are shared by most other measures of teacher effectiveness, including classroom observations (and it bears mentioning that other indicators, such as certification, have also been used interchangeably with terms like “better teachers”).
That said, the increasing tendency to equate these estimates — or any measure, for that matter — with teacher performance or teacher effectiveness without qualification or elaboration can go too far. Although the research on value-added is relatively well-developed (arguably more so than that focused on other performance measures), there is still much work to be done. Humility and skepticism regarding what these estimates can tell us about teaching and teaching quality are in order, particularly during a time when they’re just starting to be used in high-stakes decisions. At the very least, we should certainly use caution in how we describe them.
And, again, I include myself in the need for this warning.
- Matt Di Carlo