There is a tendency in education circles these days, one that I’m sure has been discussed by others, and of which I myself have been “guilty” on countless occasions. The tendency is to use terms such as “effective/ineffective teacher” or “teacher performance” interchangeably with estimates from value-added and other growth models.
Now, to be clear, I personally am not opposed to the use of value-added estimates in teacher evaluations and other policies, so long as it is done cautiously and appropriately (which, in my view, is not happening in very many places). Moreover, based on my reading of the research, I believe that these estimates can provide useful information about teachers’ performance in the classroom. In short, then, I am not disputing whether value-added scores should be considered to be one useful proxy measure for teacher performance and effectiveness (and described as such), both formally and informally.
Regardless of one’s views on value-added and its policy deployment, however, there is a point at which our failure to define terms can go too far, and perhaps cause confusion. Read More »
Unlike many of my colleagues, I don’t have a negative view of the Gates Foundation’s education programs. Although I will admit that part of me is uneasy with the sheer amount of resources (and influence) they wield, and there are a few areas where I don’t see eye-to-eye with their ideas (or grantees), I agree with them on a great many things, and I think that some of their efforts, such as the Measures of Effective Teaching (MET) project, are important and beneficial (even if I found their packaging of the MET results a bit overblown).
But I feel obliged to say that I am particularly impressed with their recent announcement of support for a two-year delay on attaching stakes to the results of new assessments aligned with the Common Core. Granted, much of this is due to the fact that I think this is the correct policy decision (see my opinion piece with Morgan Polikoff). Independent of that, however, I think it took intellectual and political courage for them to take this stance, given their efforts toward new teacher evaluations that include test-based productivity measures.
The announcement was guaranteed to please almost nobody. Read More »
A recent report from the U.S. Department of Education presented a summary of three recent studies of the differences in the effectiveness of teaching provided to advantaged and disadvantaged students (with the former defined in terms of value-added scores, and the latter in terms of subsidized lunch eligibility). The brief characterizes the results of these reports in an accessible manner – that the difference in estimated teaching effectiveness between advantaged and disadvantaged students varied quite widely between districts, but overall is about four percent of the achievement gap in reading and 2-3 percent in math.
Some observers were not impressed. They wondered why so-called reformers are alienating teachers and hurting students in order to address a mere 2-4 percent improvement in the achievement gap.
Just to be clear, the 2-4 percent figures describe the gap (and remember that it varies). Whether it can be narrowed or closed – e.g., by improving working conditions or offering incentives or some other means – is a separate issue. Nevertheless, let’s put aside all the substantive aspects surrounding these studies, and the issue of the distribution of teacher quality, and discuss this 2-4 percent thing, as it illustrates what I believe is among the most important tensions underlying education policy today: Our collective failure to have a reasonable debate about expectations and the power of education policy. Read More »
In a post earlier this week, I noted how several state and local education leaders, advocates and especially the editorial boards of major newspapers used the recently released NAEP results inappropriately – i.e., to argue that recent reforms in states such as Tennessee and D.C. are “working.” I also discussed how this illustrates a larger phenomenon in which many people seem to expect education policies to generate immediate, measurable results in terms of aggregate student test scores, which I argued is both unrealistic and dangerous.
Mike G. from Boston, a friend whose comments I always appreciate, agrees with me, but asks a question that I think gets to the pragmatic heart of the matter. He wonders whether individuals in high-level education positions have any alternative. For instance, Mike asks, what would I suggest to Kevin Huffman, who is the head of Tennessee’s education department? Insofar as Huffman’s opponents “would use any data…to bash him if it’s trending down,” would I advise him to forgo using the data in his favor when they show improvement?*
I have never held any important high-level leadership positions. My political experience and skills are (and I’m being charitable here) underdeveloped, and I have no doubt many more seasoned folks in education would disagree with me. But my answer is: Yes, I would advise him to forgo using the data in this manner. Here’s why. Read More »
A couple of months ago, Bill Gates said something that received a lot of attention. With regard to his foundation’s education reform efforts, which focus most prominently on teacher evaluations, but encompass many other areas, he noted, “we don’t know if it will work.” In fact, according to Mr. Gates, “we won’t know for probably a decade.”
He’s absolutely correct. Most education policies, including (but not limited to) those geared toward shifting the distribution of teacher quality, take a long time to work (if they do work), and the research assessing these policies requires a great deal of patience. Yet so many of the most prominent figures in education policy routinely espouse the opposite viewpoint: Policies are expected to have an immediate, measurable impact (and their effects are assessed in the crudest manner imaginable).
A perfect example was the reaction to the recent release of results of the National Assessment of Educational Progress (NAEP). Read More »
Advocates of the so-called “Florida Formula,” a package of market-based reforms enacted throughout the 1990s and 2000s, some of which are now spreading rapidly in other states, traveled to Michigan this week to make their case to the state’s lawmakers, with particular emphasis on Florida’s school grading system. In addition to arguments about accessibility and parental involvement, their empirical (i.e., test-based) evidence consisted largely of the standard, invalid claims that cross-sectional NAEP increases prove the reforms’ effectiveness, along with a bonus appearance of the argument that since Florida started grading schools, the grades have improved, even though this is largely (and demonstrably) a result of changes in the formula.
As mentioned in a previous post, I continue to be perplexed at advocates’ insistence on using this “evidence,” even though there is a decent amount of actual rigorous policy research available, much of it positive.
So, I thought it would be fun, though slightly strange, for me to try on my market-based reformer cap, and see what it would look like if this kind of testimony about the Florida reforms were actually research-based (at least the test-based evidence). Here’s a very rough outline of what I came up with: Read More »
Our guest author today is Morgan Polikoff, Assistant Professor in the Rossier School of Education at the University of Southern California.
A few weeks back, education policy wonks were hit with a set of opinion polls about education policy. The two most divergent of these polls were the Phi Delta Kappan/Gallup poll and the Associated Press/NORC poll.
This week a California poll conducted by Policy Analysis for California Education (PACE) and the USC Rossier School of Education (where I am an assistant professor) was released. The PACE/USC Rossier poll addresses many of the same issues as those from the PDK and AP, and I believe the three polls together can provide some valuable lessons about the education reform debate, the interpretation of poll results, and the state of popular opinion about key policy issues.
In general, the results as a whole indicate that parents and the public hold rather nuanced views on testing and evaluation. Read More »
I tend to comment on newly-released teacher surveys, primarily because I think the surveys are important and interesting, but also because teachers’ opinions are sometimes misrepresented in our debate about education reform. So, last year, I wrote about a report by the advocacy organization Teach Plus, in which they presented results from a survey focused on identifying differences in attitudes by teacher experience (an important topic). One of my major comments was that the survey was “non-scientific” – it was voluntary, and distributed via social media, e-mail, etc. This means that the results cannot be used to draw strong conclusions about the population of teachers as a whole, since those who responded might be different from those who did not.
I also noted that, even if the sample was not representative, this did not preclude finding useful information in the results. That is, my primary criticism was that the authors did not even mention the issue, or make an effort to compare the characteristics of their survey respondents with those of teachers in general (which can give a sense of the differences between the sample and the population).
Well, they have just issued a new report, which also presents the results of a teacher survey, this time focused on teachers’ attitudes toward the evaluation system used in Memphis, Tennessee (called the “Teacher Effectiveness Measure,” or TEM). In this case, not only do they raise the issue of representativeness, but they also present a little bit of data comparing their respondents to the population (i.e., all Memphis teachers who were evaluated under TEM). Read More »
One can often hear opponents of value-added referring to these methods as “junk science.” The term is meant to express the argument that value-added is unreliable and/or invalid, and that its scientific “façade” is without merit.
Now, I personally am not opposed to using these estimates in evaluations and other personnel policies, but I certainly understand opponents’ skepticism. For one thing, there are some states and districts in which design and implementation have been somewhat careless, and, in these situations, I very much share the skepticism. Moreover, the common argument that evaluations, in order to be “meaningful,” must consist of value-added measures in a heavily-weighted role (e.g., 45-50 percent) is, in my view, unsupportable.
All that said, calling value-added “junk science” completely obscures the important issues. The real questions here are less about the merits of the models per se than how they’re being used. Read More »
One of the more thoughtful voices in education, Larry Cuban, has delivered an interesting brief for the argument that there is no such thing as a “corporate reform movement.” While he acknowledges that America’s corporate elite largely share a view of how to reform America’s schools, focused on the creation of educational marketplaces and business-model schools as the engines of change, Cuban argues that it is a mistake to overstate the homogeneity of perspectives and purposes. The power players of the reform movement have “varied, not uniform motives,” are “drawn from overlapping, but distinct spheres of influence,” and “vary in their aims and strategies.” The use of a term such as “corporate education reform” suggests “far more coherence and concerted action than occurs in the real world of politics and policymaking.”
Cuban’s argument amalgamates two different senses of the term “corporate education reform” – the notion that there is a movement for education reform led by corporate elites and the idea that there is a movement for education reform that seeks to remake public education in the image and likeness of for-profit corporations in a competitive marketplace.
In co-mingling these two distinct senses of the term, Cuban is adopting a common usage. And it is a usage not entirely without justification: many of the strongest advocates for transforming public schools into educational corporations are found in the corporate elite. But it is vital, I will argue here, that we separate these two conceptions of “corporate education reform” if we are to adequately understand the complexity of the political terrain on which the battles over the future of public education are being fought. Read More »
Controversial proposals for new teacher evaluation systems have generated a tremendous amount of misinformation. It has come from both “sides,” ranging from minor misunderstandings to gross inaccuracies. Ostensibly to address some of these misconceptions, the advocacy group Students First (SF) recently released a “myth/fact sheet” on evaluations.
Despite the need for oversimplification inherent in “myth/fact” sheets, the genre can be useful, especially about topics such as evaluation, about which there is much confusion. When advocacy groups produce them, however, the myths and facts sometimes take the form of “arguments we don’t like versus arguments we do like.”
This SF document falls into that trap. In fact, several of its claims are a little shocking. I would still like to discuss the sheet, not because I enjoy picking apart the work of others (I don’t), but rather because I think elements of both the “myths” and “facts” in this sheet could be recast as “dual myths” in a new sheet. That is, this document helps to illustrate how, in many of our most heated education debates, the polar opposite viewpoints that receive the most attention are often both incorrect, or at least severely overstated, and usually serve to preclude more productive, nuanced discussions.
Let’s take all four of SF’s “myth/fact” combinations in turn. Read More »
In a story for Education Week, the always reliable Stephen Sawchuk reports on what may be a trend in states’ first results from their new teacher evaluation systems: The ratings are skewed toward the top.
For example, the article notes that, in Michigan, Florida and Georgia, a high proportion of teachers (more than 90 percent) received one of the two top ratings (out of four or five). This has led to some grumbling among advocates and others, citing similarities between these results and those of the old systems, in which the vast majority of teachers were rated “satisfactory,” and very few were found to be “unsatisfactory.”
Differentiation is very important in teacher evaluations – it’s kind of the whole point. Thus, it’s a problem when ratings are too heavily concentrated toward one end of the distribution. However, as Aaron Pallas points out, these important conversations about evaluation results sometimes seem less focused on good measurement or even the spread of teachers across categories than on the narrower question of how many teachers end up with the lowest rating – i.e., how many teachers will be fired.
In a Slate article published last October, Daniel Engber bemoans the frequently shallow use of the classic warning that “correlation does not imply causation.” Mr. Engber argues that the correlation/causation distinction has become so overused in online comments sections and other public fora as to hinder real debate. He also posits that correlation does not mean causation, but “it sure as hell provides a hint,” and can “set us down the path toward thinking through the workings of reality.”
Correlations are extremely useful, in fact essential, for guiding all kinds of inquiry. And Engber is no doubt correct that the argument is overused in public debates, often in lieu of more substantive comments. But let’s also be clear about something – careless causal inferences likely do more damage to the quality and substance of policy debates on any given day than the misuse of the correlation/causation argument does over the course of months or even years.
We see this in education constantly. For example, mayors and superintendents often claim credit for marginal increases in testing results that coincide with their holding office. The causal leaps here are pretty stunning. Read More »
A recent article in Reuters, one that received a great deal of attention, sheds light on practices that some charter schools are using essentially to screen students who apply for admission. These policies include requiring long and difficult applications, family interviews, parental contracts, and even demonstrations of past academic performance.
It remains unclear how common these practices might be in the grand scheme of things, but regardless of how frequently they occur, most of these tactics are terrible, perhaps even illegal, and should be stopped. At the same time, there are two side points to keep in mind when you hear about charges such as these, as well as the accusations (and denials) of charter exclusion and segregation that tend to follow.
The first is that some degree of (self-)sorting and segregation of students by abilities, interests and other characteristics is part of the deal in a choice-based system. The second point is that screening and segregation are most certainly not unique to charter/private schools, and one primary reason is that there is, in a sense, already a lot of choice among regular public schools. Read More »
** Reprinted here in the Washington Post
In a recent post, Kevin Drum of Mother Jones discusses his growing skepticism about the research behind market-based education reform, and about the claims that supporters of these policies make. He cites a recent Los Angeles Times article, which discusses how, in 2000, the San Jose Unified School District in California instituted a so-called “high expectations” policy requiring all students to pass the courses necessary to attend state universities. The reported percentage of students passing these courses increased quickly, causing the district and many others to declare the policy a success. In 2005, Los Angeles Unified, the nation’s second largest district, adopted similar requirements.
For its part, the Times performed its own analysis, and found that the San Jose pass rate was no higher in 2011 than in 2000 (actually, slightly lower for some subgroups), and that the district had overstated its early results by classifying students in a misleading manner. Mr. Drum, reviewing these results, concludes: “It turns out it was all a crock.”
In one sense, that’s true – the district seems to have reported misleading data. On the other hand, neither San Jose Unified’s original evidence (with or without the misclassification) nor the Times analysis is anywhere near sufficient for drawing conclusions – “crock”-based or otherwise – about the effects of this policy. This illustrates the deeper problem here, which is less about one “side” or the other misleading with research than about something much more difficult to address: common misconceptions that impede distinguishing good evidence from bad.