The Perilous Conflation Of Student And School Performance

Unlike many of my colleagues and friends, I personally support the use of standardized testing results in education policy, even, with caution and in a limited role, in high-stakes decisions. That said, I also think that the focus on test scores has gone way too far and that the scores are being used unwisely, in many cases to a degree at which I believe the policies will not only fail to generate improvement, but may even risk harm.

In addition, of course, tests have a very productive low-stakes role to play on the ground – for example, when teachers and administrators use the results for diagnosis and to inform instruction.

Frankly, I would be a lot more comfortable with the role of testing data – whether in policy, on the ground, or in our public discourse – but for the relentless flow of misinterpretation from both supporters and opponents. In my experience (which I acknowledge may not be representative of reality), by far the most common mistake is the conflation of student and school performance, as measured by testing results.

Consider the following three stylized arguments, which you can hear in some form almost every week:

Only one-third of our students are reading at grade level; our schools are failing;
95 percent of the teachers in this district receive satisfactory ratings, but that can’t be accurate, because only half the students are proficient in math and reading;
These reforms are working – state test scores have risen steadily.

All three of these inferences are inappropriate for one primary reason: they fail to acknowledge that raw, unadjusted testing results – whether actual scores/proficiency rates or changes in those scores/rates – are not, by themselves, credible measures of school performance. They are largely (imperfect) measures of student performance. There is a difference.

Everyone involved in education knows that most of the variation in testing outcomes is “between students” – i.e., has to do with factors, most unmeasured/unobserved, that are attributes of the students themselves and their upbringing and environment (such as English proficiency, oral language development, background knowledge, family situation, etc.).

This well-established finding is sometimes interpreted to mean that schools (or teachers) can only exert minimal influence on student performance. That is false. Not only are schooling factors among the only targets within the purview of education policy, they can also be very influential. Improvements in the quality of schooling/instruction can have substantial effects on student outcomes (though I sometimes think we need to be more realistic about the pace of change).

Nevertheless, learning is complex and much (if not most) of it occurs outside of schools and/or before children reach schooling age. Test scores – and changes in those scores – are subject to these influences. A school with low test scores is not necessarily a “failing school,” just as a school with very high scores is not necessarily successful.

Similarly, one should not assume that a school’s slow score growth is necessarily caused by a problem in that school. The reason why the research on school (and teacher) effects is so complex is that much of it is geared toward controlling for all of the external factors that can be measured and are known to affect outcomes. In other words, the analysis is trying to isolate that portion of student performance that can reasonably be attributed to school performance. A great deal of the raw variation is also simple random error.
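To make the idea of adjustment concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two schools, the prior and current scores, the built-in school effects, and the single control variable (prior score). It is not the method used in any actual value-added or accountability system. It simply compares raw average scores with averages that account for where students started.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
# Each student's current score is generated as
#   current = 10 + 0.9 * prior + school_effect
# with an assumed school effect of -2 for school A and +3 for school B.
TRUE_EFFECT = {"A": -2.0, "B": 3.0}
PRIOR = {"A": [70, 75, 80, 85, 90], "B": [40, 45, 50, 55, 60]}

students = [
    (school, prior, 10 + 0.9 * prior + TRUE_EFFECT[school])
    for school, priors in PRIOR.items()
    for prior in priors
]


def raw_means(rows):
    """Average current score per school, with no adjustment."""
    schools = sorted({s for s, _, _ in rows})
    return {k: mean(y for s, _, y in rows if s == k) for k in schools}


def adjusted_means(rows):
    """Average residual per school after a pooled regression on prior score.

    Fits current = intercept + slope * prior across all students, then
    averages (actual - predicted) within each school.
    """
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    schools = sorted({s for s, _, _ in rows})
    return {
        k: mean(y - (intercept + slope * x) for s, x, y in rows if s == k)
        for k in schools
    }


raw = raw_means(students)
adjusted = adjusted_means(students)
print("raw mean score:      ", raw)
print("adjusted (residual): ", adjusted)
```

In this made-up example, school A has the higher raw average only because its students started out with higher scores. Once prior scores are taken into account, school B comes out ahead, which matches the effects built into the data. Note also that the adjusted gap is smaller than the 5-point gap that was built in: because school and intake are correlated here, even this simple adjustment is an imperfect estimate.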

Yes, when a group of students' test scores rise over a few years, that’s a pretty good tentative indication that the school is doing something correctly. But it’s all a matter of degree. The gains (assuming they’re even measured with longitudinal data, which they often are not) will also reflect factors (e.g., prior achievement levels) that have nothing to do with the school, to an extent that can vary widely. If you rely solely on unadjusted testing results, you don’t know. And if you don’t know, you risk making decisions based on erroneous assumptions.

The worst part is that this distinction – between tests as measures of student performance versus school performance – is ignored by policymakers just as frequently as it is in our public discourse.

States are closing schools, handing out ratings and awarding grant money based on badly flawed interpretations of raw testing data. It’s one thing for journalists and the public to make this mistake; it’s something else entirely for the people we rely on to decide education policy to make it too.

In short, I would be a lot more optimistic about “data-driven decision making” if so many of the decision makers weren’t such erratic drivers.

- Matt Di Carlo

The political stakes are too high to use standardized tests in a moderate and reasonable way.

Politics demand that they be used in the service of ideology.

Good point. The problem is even more acute in high schools, where in many cases tests at a single grade are the sole measure of a school's performance. If the grade is 10th, these scores will say a lot more about the experiences of kids before they entered the school than what happened to them while they were there.

Some states have attempted, in the waiver applications submitted last November, to develop broader measures of school performance, particularly for high schools. But these proposals are getting some resistance from critics who claim that they will dilute the focus on student achievement. What are poor, well-meaning policy makers to do?

"This well-established finding is sometimes interpreted to mean that schools (or teachers) can only exert minimal influence on student performance. That is false. Not only are schooling factors among the only targets within the purview of education policy, they can also be very influential. Improvements in the quality of schooling/instruction can have substantial effects on student outcomes"

The part about the purview of ed policy has nothing to do with the power or usefulness of those policies.

And, if you're going to say that schooling factors can be very influential, you should tell us how and to what extent, especially since you say that "most of the variation in testing outcomes is “between students” – i.e., has to do with factors, most unmeasured/unobserved, that are attributes of the students themselves and their upbringing and environment," which seems to belie your claim up there.

Question: Sanders claims that all outside variables can be accounted for by controlling for previous academic growth. Is this true? And if it is, wouldn't it then provide an accurate picture of student growth and school quality?

John Doe, no.

I would venture to say that anyone who says anything about statistics and uses the phrase "all outside variables" doesn't understand statistics or reality.

Education Reform:

Teachers are being held to higher standards of accountability in the classroom for student learning. What is a proper method for involving parents?

How can parents share in the accountability for their child's learning while in school?