In a post earlier this week, I noted how several state and local education leaders, advocates and especially the editorial boards of major newspapers used the recently released NAEP results inappropriately – i.e., to argue that recent reforms in states such as Tennessee and D.C. are “working.” I also discussed how this illustrates a larger phenomenon in which many people seem to expect education policies to generate immediate, measurable results in terms of aggregate student test scores, an expectation I argued is both unrealistic and dangerous.
Mike G. from Boston, a friend whose comments I always appreciate, agrees with me, but asks a question that I think gets to the pragmatic heart of the matter. He wonders whether individuals in high-level education positions have any alternative. For instance, Mike asks, what would I suggest to Kevin Huffman, who is the head of Tennessee’s education department? Insofar as Huffman’s opponents “would use any data…to bash him if it’s trending down,” would I advise him to forego using the data in his favor when they show improvement?*
I have never held any high-level leadership position. My political experience and skills are (and I’m being charitable here) underdeveloped, and I have no doubt that many more seasoned folks in education would disagree with me. But my answer is: Yes, I would advise him to forego using the data in this manner. Here’s why.
It is, I admit, tempting to justify my advice in a self-righteous manner – that is, to say that using NAEP data in this manner is wrong when Mr. Huffman’s opponents do it, and so it would be wrong for him to do it. But I am not quite so naïve as to believe that one can separate politics from policy, and I realize that people in positions of responsibility must make difficult decisions as to how to best advance their agendas (to his/their credit, Tennessee’s official press release didn’t make any inappropriate claims, but see this story in Education Week).
Luckily, though, I think there’s a strong practical case against misusing data. The first component of that case is that using bad evidence represents an implicit endorsement of doing so, and because bad evidence is inherently unreliable, it will often come back to haunt those who use it.
When I say that NAEP cohort changes are not policy evidence, I am not expressing a gut feeling. The evidence is clear – the fact that fourth or eighth graders in a given year scored higher than fourth or eighth graders in previous years can be severely influenced by (often unmeasurable) differences in the samples of students taking the test (also here). That is one big reason why these simple changes often conflict with more rigorous alternative measures. Moreover, even if the cohort changes could provide a good idea of whether the typical student improved over time, that says nothing about which policy or policies might explain the increase. This is particularly salient given that policies such as raising standards and new teacher evaluations, if successful, are likely to exert influence on aggregate performance over the long- rather than short-term.
And, if I were the head of a state education agency, with my reputation and, more importantly, the fate of my policy agenda riding to no small extent on test-based results, I can think of nothing more undesirable than having these outcomes subject to the gods of sampling variation or careless causal inference. In two years, NAEP results for Tennessee may very well be flat – or even decrease – and it may be due to nothing more than compositional shifts, or non-school factors, or even the simple fact that policies take time to work. In that case, Mr. Huffman may find himself stuck – if he has already used the NAEP results to his advantage, what will he say when they are not in his favor?
(Side note: This is precisely the position currently occupied by opponents of market-based reform, several of whom have used NAEP and similar data to argue against these policies in the past, and now find themselves in the position of having to assert that they are not valid for this purpose.)
In addition, and perhaps more importantly, we need not rely on bad evidence. If you believe deeply in the merit of your policy agenda, then it is in your interest to evaluate its components directly and rigorously. There are many dozens of excellent researchers out there who would be only too happy to collaborate with state and local education agencies in this endeavor. Frankly, if I were in charge of a state or large district, and if I had the resources, I would make sure to have researchers designing an evaluation and collecting data right at the outset of any major policy change.
Now, to reiterate, I understand fully that positions such as Kevin Huffman’s entail immense political pressure, and that there are many people just waiting for any opportunity to shoot him down (just as there are many waiting to prop him up). And it very well may be the case that seizing on the NAEP data, though indefensible from a research perspective, is the smart political decision. I am not qualified to render that judgment.
I can, however, say, for whatever it’s worth, that any major education leader who passed up the opportunity to score political points because it was empirically inappropriate would earn my unqualified respect and admiration, and I know many others who feel the same way. Furthermore, doing so could provide cover for others to follow suit.
If we’re looking for “bold leadership,” that would be about as good an example as I could possibly come up with.
- Matt Di Carlo