Last year, the New York City Department of Education (NYCDOE) rolled out its annual testing results for the city’s students in a rather misleading manner. The press release touted the “significant progress” between 2010 and 2011 among city students, while, at a press conference, Mayor Michael Bloomberg called the results “dramatic.” In reality, however, the increase in proficiency rates (1-3 percentage points) was very modest, and, more importantly, the focus on the rates hid the fact that actual scale scores were either flat or decreased in most grades. In contrast, one year earlier, when the city’s proficiency rates dropped due to the state raising the cut scores, Mayor Bloomberg told reporters (correctly) that it was the actual scores that “really matter.”
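To see how a proficiency rate can rise while actual scores fall, here is a minimal sketch in Python. The scale scores and the cut score of 650 are entirely hypothetical, invented for illustration, and are not NYC data. The only point is that a rate counts students crossing a single line, while the average reflects movement everywhere in the distribution.

```python
from statistics import mean

# Hypothetical cut score; NOT the actual New York State cut score.
CUT_SCORE = 650


def proficiency_rate(scores, cut_score):
    """Percent of students scoring at or above the cut score."""
    return 100 * sum(s >= cut_score for s in scores) / len(scores)


# Hypothetical scale scores for ten students in two consecutive years.
scores_year1 = [600, 620, 640, 648, 649, 660, 680, 700, 720, 740]
scores_year2 = [595, 615, 635, 645, 650, 655, 675, 695, 715, 735]

# Change in the proficiency rate, in percentage points.
rate_change = (proficiency_rate(scores_year2, CUT_SCORE)
               - proficiency_rate(scores_year1, CUT_SCORE))

# Change in the average scale score, in scale-score points.
mean_change = mean(scores_year2) - mean(scores_year1)

print(f"Proficiency rate change: {rate_change:+.1f} percentage points")
print(f"Average scale score change: {mean_change:+.1f} points")
```

In this made-up example, one student moves from just below the cut score to just at it, so the rate goes up, even though nearly every score (and therefore the average) goes down.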
More recently, in announcing its 2011 graduation rates, the city did it again. The headline of the NYCDOE press release proclaims that “a record number of students graduated from high school in 2011.” This may be technically true, but the actual increase in the rate (rather than the number of graduates) was 0.4 percentage points, which is basically flat (as several reporters correctly noted). In addition, the city’s “college readiness rate” was similarly stagnant, falling slightly from 21.4 percent to 20.7 percent, while the graduation rate increase was higher both statewide and in New York State’s four other large districts (the city makes these comparisons when they are favorable).*
We’ve all become accustomed to this selective, exaggerated presentation of testing data, which is of course not at all limited to NYC. And it illustrates the obvious fact that test-based accountability plays out in multiple arenas, formal and informal, including the court of public opinion.
Some of the errors found in press releases and other official communications, in NYC and elsewhere, are common and probably unintentional (e.g., all three of the mistakes I discussed in this post). In other instances, however, results are misinterpreted in such a blatant fashion as to be a little absurd.
For instance, earlier this year, the New Jersey state education department issued a press release about achievement gaps that was rife with misinterpretations and overstated conclusions. The department did the same thing in a state report on NJ charter school results, which it released at the same time it announced the governor’s plan to expand the charter sector.
For years, Chicago officials reported huge increases in student performance that were, at best, significantly overstated, while press releases coming from D.C. Public Schools over the past couple of years have gone to great lengths to present lackluster results in a favorable light. For example, see last year’s release, in which the headline points to “continued progress” among a subset of students, even though proficiency rates declined in most grades. Making things worse, DCPS does not release scale scores, which precludes a full comparison between years.
The root cause of this annual ritual is obvious: In some places, test scores are political life and death for elected and appointed officials. Reputations are so linked with results that the presentation of data can become an exercise in political messaging rather than an effort to inform the public with realistic, level-headed analysis of student performance.
In other words, not all test-based accountability is codified in policies like evaluations or school rating systems. The stakes are also informal and based on public perception, but they are no less high.
No matter what you think about the overall emphasis on testing in our education system, the politicization of annual results is an undesirable, if perhaps inevitable, side effect, one that is unlikely to abate any time soon. This is unfortunate given that cross-sectional, short-term test score trends are often driven as much by error and external factors as by concrete policies or leadership, if not more.
In any case, it makes the job of education journalists exceedingly important. One hopes that they view all press releases with even more skepticism than usual, especially in places where the focus on test results is intense. The data, interpreted properly, can be very useful, but, by themselves, they’re not appropriate for judging individuals or policies, especially in the short term. That may not make for particularly engrossing stories, but sometimes being a little boring is a public service.
- Matt Di Carlo
* Incidentally, the “all-time record” message was also used in announcing the previous year’s graduation results, which showed a 2.4 percentage point increase in the rate.