Some of the best research out there is a product not of sophisticated statistical methods or complex research designs, but rather of painstaking manual data collection. A good example is a recent paper by Morgan Polikoff, Andrew McEachin, Stephani Wrabel and Matthew Duque, which was published in the latest issue of the journal Educational Researcher.
Polikoff and his colleagues performed a task that makes most of the rest of us cringe: They read and coded each of the more than 40 state applications for ESEA flexibility, or “waivers.” The end product is a simple but highly useful presentation of the measures states are using to identify “priority” (low-performing) schools and “focus” schools (those “contributing to achievement gaps”). The results are disturbing to anyone who believes that strong measurement should guide educational decisions.
There’s plenty of great data and discussion in the paper, but consider just one central finding: How states are identifying priority (i.e., lowest-performing) schools at the elementary level (the measures are of course a bit different for secondary schools).
** Reprinted here in the Core Knowledge Blog
How much do preschoolers from disadvantaged and more affluent backgrounds know about the world and why does that matter? One recent study by Tanya Kaefer (Lakehead University), Susan B. Neuman (New York University), and Ashley M. Pinkham (University of Michigan) provides some answers.
The researchers randomly selected children from preschool classrooms in two sites, one serving kids from disadvantaged backgrounds, the other serving middle-class kids. They then set about to answer three questions:
A new working paper, published by the National Bureau of Economic Research, is the first high-quality assessment of one of the new teacher evaluation systems sweeping across the nation. The study, by Thomas Dee and James Wyckoff, both highly respected economists, focuses on the first three years of IMPACT, the evaluation system put into place in the District of Columbia Public Schools in 2009.
Under IMPACT, each teacher receives a point total based on a combination of test-based and non-test-based measures (the formula varies between teachers who are and are not in tested grades/subjects). These point totals are then sorted into one of four categories – highly effective, effective, minimally effective and ineffective. Teachers who receive a highly effective (HE) rating are eligible for salary increases, whereas teachers rated ineffective are dismissed immediately and those receiving minimally effective (ME) ratings for two consecutive years can also be terminated. The design of this study exploits that incentive structure by, put very simply, comparing the teachers who were directly above the ME and HE thresholds with those who were directly below them, to see whether the two groups differed in terms of retention and performance. The basic idea is that teachers on either side of a threshold are all very similar in terms of their measured performance, so any differences in outcomes can be (cautiously) attributed to the system’s incentives.
The short answer is that there were meaningful differences.
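For readers who want to see the logic of that threshold comparison in concrete terms, here is a minimal Python sketch. It is not the authors’ method: a real analysis of this kind fits models on either side of the cutoff and checks how sensitive the results are to the width of the window. The sketch simply compares average outcomes for teachers just below and just above a cutoff. The function name, the cutoff of 250 points, the 10-point window and the toy records are all invented for illustration.

```python
from statistics import mean


def local_threshold_gap(records, threshold, bandwidth):
    """Mean outcome just below a cutoff minus mean outcome just above it.

    records: iterable of (score, outcome) pairs.
    Only scores within `bandwidth` points of `threshold` are used.
    """
    below = [y for s, y in records if threshold - bandwidth <= s < threshold]
    above = [y for s, y in records if threshold <= s < threshold + bandwidth]
    if not below or not above:
        raise ValueError("no observations on one side of the threshold")
    return mean(below) - mean(above)


# Toy data, invented for illustration only:
# (point total, 1 if the teacher left the following year, else 0)
toy = [
    (240, 1), (245, 0), (248, 1), (249, 1),   # just below the hypothetical cutoff
    (250, 0), (252, 0), (255, 1), (258, 0),   # just above it
    (300, 0), (180, 1),                       # far from the cutoff, ignored
]

gap = local_threshold_gap(toy, threshold=250, bandwidth=10)
print(gap)  # prints 0.5
```

In this made-up example, 75 percent of the teachers just below the cutoff leave, versus 25 percent of those just above it. Because the two groups have nearly identical scores, a gap like that is the kind of difference one would (cautiously) attribute to the incentives rather than to differences in measured performance.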
In a new NBER working paper, economist Derek Neal makes an important point, one of which many people in education are aware but which is infrequently reflected in actual policy. The point is that using the same assessment to measure both student and teacher performance often contaminates the results for both purposes.
In fact, as Neal notes, some of the very features required to measure student performance are the ones that make possible the contamination when the tests are used in high-stakes accountability systems. Consider, for example, a situation in which a state or district wants to compare the test scores of a cohort of fourth graders in one year with those of fourth graders the next year. One common means of facilitating this comparability is administering some of the questions to both groups (or to some “pilot” sample of students prior to those being tested). Otherwise, any difference in scores between the two cohorts might simply be due to differences in the difficulty of the questions. If you cannot check that out, it’s tough to make meaningful comparisons.
But it’s precisely this need to repeat questions that enables one form of so-called “teaching to the test,” in which administrators and educators use questions from prior assessments to guide their instruction for the current year.
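To make the role of repeated questions a bit more concrete, here is a deliberately crude Python sketch. It is not an operational equating method and is not taken from Neal’s paper. It assumes that both cohorts’ scores are expressed as percent correct, and that the change on the shared questions is a fair stand-in for the true change between cohorts; whatever is left of the raw change is then chalked up to differences in test difficulty. The function name and all of the numbers are invented for illustration.

```python
from statistics import mean


def split_score_change(anchor_y1, anchor_y2, total_y1, total_y2):
    """Split a raw change in mean scores using questions common to both years.

    All inputs are lists of percent-correct scores, one per student.
    Returns (raw_change, cohort_change, form_effect), where
    cohort_change is the change on the shared ("anchor") questions and
    form_effect is the part of raw_change left over after removing it.
    """
    raw_change = mean(total_y2) - mean(total_y1)
    cohort_change = mean(anchor_y2) - mean(anchor_y1)
    form_effect = raw_change - cohort_change
    return raw_change, cohort_change, form_effect


# Toy numbers, invented for illustration only.
year1_anchor = [60, 70, 80]   # year-one fourth graders, shared questions
year2_anchor = [62, 72, 82]   # year-two fourth graders, same shared questions
year1_total = [65, 70, 75]    # year-one scores on the full year-one test
year2_total = [58, 63, 68]    # year-two scores on the full year-two test

raw, cohort, form = split_score_change(
    year1_anchor, year2_anchor, year1_total, year2_total
)
print(raw, cohort, form)
```

In this toy case the raw scores fall by seven points, yet the second cohort does two points better on the shared questions, which suggests the drop reflects a harder test rather than weaker students. It also shows why the shared questions are so tempting a target: if teachers drill students on them, the one yardstick that makes the comparison possible is the first thing to be distorted.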
In education today, data, particularly testing data, are everywhere. One of many potentially valuable uses of these data is helping teachers improve instruction – e.g., by identifying students’ strengths and weaknesses. Of course, this positive impact depends on the quality of the data and how they are presented to educators, among other factors. But there’s an even more basic requirement – teachers actually have to use them.
In an article published in the latest issue of the journal Education Finance and Policy, economist John Tyler takes a thorough look at teachers’ use of an online data system in a mid-sized urban district between 2008 and 2010. A few years prior, this district invested heavily in benchmark formative assessments (four per year) for students in grades 3-8, and an online “dashboard” system to go along with them. The assessments’ results are fed into the system in a timely manner. The basic idea is to give these teachers a continual stream of information, past and present, about their students’ performance.
Tyler uses weblogs from the district, as well as focus groups with teachers, to examine the extent and nature of teachers’ data usage (as well as a few other things, such as the relationship between usage and value-added). What he finds is not particularly heartening. In short, teachers didn’t really use the data.
Education researchers have paid a lot of attention to the sorting of teachers across schools. For example, it is well known that schools serving more low-income students tend to employ teachers who are, on average, less qualified (in terms of experience, degree, certification, etc.; also see here).
Far less well-researched, however, is the issue of sorting within schools – for example, whether teachers with certain characteristics are assigned to classes with different students than their colleagues in the same school. In addition to the obvious fact that which teachers are in front of which students every day is important, this question bears on a few major issues in education policy today. For example, there is evidence that teacher turnover is influenced by the characteristics of the students teachers teach, which means that classroom assignments might either exacerbate or mitigate mobility and attrition. In addition, teacher productivity measures such as value-added may be affected by the sorting of students into classes based on characteristics for which the models do not account, and a better understanding of the teacher/student matching process could help inform this issue.
A recent article, which was published in the journal Sociology of Education, sheds light on these topics with a very interesting look at the distribution of students across teachers’ classrooms in Miami-Dade between 2003-04 and 2010-11. The authors’ primary question is: Are certain characteristics, most notably race/ethnicity, gender, experience, or pre-service qualifications (e.g., SAT scores), associated with assignment to higher- or lower-scoring students among teachers in the same school, grade, and year?
** Reprinted here in the Washington Post
A big part of successful policy making is unyielding attention to detail (an argument that regular readers of this blog hear often). Choices about design and implementation that may seem unimportant can play a substantial role in determining how policies play out in practice.
A new paper, co-authored by Elizabeth Davidson, Randall Reback, Jonah Rockoff and Heather Schwartz, and presented at last month’s annual conference of The Association for Education Finance and Policy, illustrates this principle vividly, and on a grand scale, with an analysis of outcomes in all 50 states during the early years of NCLB.
After a terrific summary of the law’s rules and implementation challenges, as well as some quick descriptive statistics, the paper’s main analysis is a straightforward examination of why the proportion of schools meeting AYP varied quite a bit between states. For instance, in 2003, the first year of results, 32 percent of U.S. schools failed to make AYP, but the proportion ranged from one percent in Iowa to over 80 percent in Florida.
Surprisingly, the results suggest that the primary reasons for this variation seem to have had little to do with differences in student performance. Rather, the big factors are subtle differences in rather arcane rules that each state chose during the implementation process. These decisions received little attention, yet they had a dramatic impact on the outcomes of NCLB during this time period.
Charter schools, though they comprise a remarkably diverse sector, are quite often subject to broad generalizations. Opponents, for example, promote the characterization of charters as test prep factories, though this is a sweeping claim without empirical support. Another common stereotype is that charter schools exclude students with special needs. It is often (but not always) true that charters serve disproportionately fewer students with disabilities, but the reasons for this are complicated and vary a great deal, and there is certainly no evidence for asserting a widespread campaign of exclusion.
Of course, these types of characterizations, which are also leveled frequently at regular public schools, don’t always take the form of criticism. For instance, it is an article of faith among many charter supporters that these schools, thanks to the fact that relatively few are unionized, are better able to aggressively identify and fire low-performing teachers (and, perhaps, retain high performers). Unlike many of the generalizations from both “sides,” this one is a bit more amenable to empirical testing.
A recent paper by Joshua Cowen and Marcus Winters, published in the journal Education Finance and Policy, is among the first to take a look, and some of the results might be surprising.
Drawing on a half century of empirical evidence, as well as new data and analysis, a team of scholars has challenged the substance of many of the attacks on public employees and their unions – urging political leaders and the research community to take this “transformational” moment in the divisive and ideologically driven debate over the role of government and the value of public services to deepen their commitment to evidence-based policy ideas.
These arguments were outlined in “The Great New Debate about Unionism and Collective Bargaining in U.S. State and Local Governments,” published by Cornell University’s ILR Review. The authors – David Lewin (UCLA), Jeffrey Keefe (Rutgers), and Thomas Kochan (MIT) – point out that, with half a century of experience, there is now a wealth of data by which to evaluate public sector unionism and its effects.
In that context, the authors spell out the history, arguments and empirical findings on three key issues: 1) Are public employees overpaid?; 2) Do labor-management dispute resolution procedures, which are part of many state and local government collective bargaining laws, enhance or hinder effective governance?; 3) Have unions and managers in the public sector demonstrated the ability to respond constructively to fiscal crises?
In education policy debates, we like the “big picture.” We love to say things like “hold schools accountable” and “set high expectations.” Much less frequent are substantive discussions about the details of accountability systems, but it’s these details that make or break policy. The technical specs just aren’t that sexy. But even the best ideas with the sexiest catchphrases won’t improve things a bit unless they’re designed and executed well.
In this vein, I want to recommend a very interesting CALDER working paper by Mark Ehlert, Cory Koedel, Eric Parsons and Michael Podgursky. The paper takes a quick look at one of these extremely important, yet frequently under-discussed details in school (and teacher) accountability systems: The choice of growth model.
When value-added or other growth models come up in our debates, they’re usually discussed en masse, as if they’re all the same. They’re not. It’s well-known (though perhaps overstated) that different models can, in many cases, lead to different conclusions for the same school or teacher. This paper, which focuses on school-level models but might easily be extended to teacher evaluations as well, helps illustrate this point in a policy-relevant manner.
** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post
The New Teacher Project (TNTP) has a new, highly-publicized report about what it calls “irreplaceables,” a catchy term that is supposed to describe those teachers who are “so successful they are nearly impossible to replace.” The report’s primary conclusion is that these “irreplaceable” teachers often leave the profession voluntarily, and TNTP offers several recommendations for how to improve this.
I’m not going to discuss this report fully. It shines a light on teacher retention, which is a good thing. Its primary purpose is to promulgate the conceptual argument that not all teacher turnover is created equal – i.e., that it depends on whether “good” or “bad” teachers are leaving (see here for a strong analysis on this topic). The report’s recommendations are standard fare – improve working conditions, tailor pay to “performance” (see here for a review of evidence on incentives and retention), etc. Many are widely-supported, while others are more controversial. All of them merit discussion.
I just want to make one quick (and, in many respects, semantic) point about the manner in which TNTP identifies high-performing teachers, as I think it illustrates larger issues. In my view, the term “irreplaceable” doesn’t apply, and I think it would have been a better analysis without it.
Economist Jesse Rothstein recently released a working paper about which I am compelled to write, as it speaks directly to so many of the issues that we have raised here over the past year or two. The purpose of Rothstein’s analysis is to move beyond the talking points about teaching quality in order to see if strategies that have been proposed for improving it might yield benefits. In particular, he examines two labor market-oriented policies: performance pay and dismissing teachers.
Both strategies are, at their cores, focused on selection (and deselection) – in other words, attracting and retaining higher-performing candidates and exiting, directly or indirectly, lower-performing incumbents. Both also take time to work and have yet to be experimented with systematically in most places; thus, there is relatively little evidence on the long-term effects of either.
Rothstein’s approach is to model this complex dynamic, specifically the labor market behavior of teachers under these policies (i.e., choosing, leaving and staying in teaching), which is often ignored or assumed away, despite the fact that it is so fundamental to the policies themselves. He then calculates what would happen under this model as a result of performance pay and dismissal policies – that is, how they would affect the teacher labor market and, ultimately, student performance.*
Of course, this is just a simulation, and must be (carefully) interpreted as such, but I think the approach and findings help shed light on three fundamental points about education reform in the U.S.
The majority of social science research does not explicitly dwell on how we go from situation A to situation B. Instead, most social scientists focus on associations between different outcomes. This “static” approach has advantages but also limitations. Looking at associations might reveal that teachers who experience condition A are twice as likely to leave their schools as teachers who experience condition B. But what does this knowledge tell us about how to move from condition A to condition B? In many cases, very little.
Many social science findings are not easily “actionable” for policy purposes precisely because they say nothing about processes or sequences of events and activities unfolding over time, and in context. While conventional quantitative research provides indications of what works — on average — across large samples, a look at processes reveals how factors or events (situated in time and space) are associated with each other. This kind of research provides the detail that we need, not just to understand the world, but to do so in a way that is useful and enables us to act on it constructively.
Although this kind of work is rare, every now and then a quantitative study showing “process sensitivity” sees the light of day. Such is the case with a recent paper by Morgan and colleagues (2010) examining how the events that teachers experience routinely affect their commitment to remain in the profession.
There’s a fairly large body of research showing that charter schools vary widely in test-based performance relative to regular public schools, by both location and subgroup. Yet, you’ll often hear people point out that the highest-quality evidence suggests otherwise (see here, here and here) – i.e., that there are a handful of studies using experimental methods (randomized controlled trials, or RCTs) and these analyses generally find stronger, more uniform positive charter impacts.
Sometimes, this argument is used to imply that the evidence, as a whole, clearly favors charters, and, perhaps by extension, that many of the rigorous non-experimental charter studies – those using sophisticated techniques to control for differences between students – would lead to different conclusions were they RCTs.*
Though these latter assertions are based on a valid point about the power of experimental studies (the few of which we have are often ignored in the debate over charters), they are dubiously overstated for a couple of reasons, discussed below. But a new report from the (indispensable) organization Mathematica addresses the issue head on, by directly comparing estimates of charter school effects that come from an experimental analysis with those from non-experimental analyses of the same group of schools.
The researchers find that there are differences in the results, but many are not statistically significant and those that are don’t usually alter the conclusions. This is an important (and somewhat rare) study, one that does not, of course, settle the issue, but does provide some additional tentative support for the use of strong non-experimental charter research in policy decisions.
In a previous post, I discussed the idea of “attracting the best candidates” to teaching by reviewing the research on the association between pre-service characteristics and future performance (usually defined in terms of teachers’ estimated effect on test scores once they get into the classroom). In general, this body of work indicates that, while far from futile, it’s extremely difficult to predict who will be an “effective” teacher based on their paper traits, including those that are typically used to define “top candidates,” such as the selectivity of the undergraduate institutions they attend, certification test scores and GPA (see here, here, here and here, for examples).
There is some very limited evidence that other, “non-traditional” measures might help. For example, a working paper, released last year, found a statistically discernible, fairly strong association between first-year math value-added and an index constructed from surveys administered to Teach for America candidates. There was, however, no association in reading (note that the sample was small), and no relationships in either subject found during these teachers’ second years.*
A recently published paper – which appears in the peer-reviewed journal Education Finance and Policy and was originally released as a working paper in 2008 – represents another step forward in this area. The analysis, presented by the respected quartet of Jonah Rockoff, Brian Jacob, Thomas Kane, and Douglas Staiger (RJKS), attempts to look beyond the set of characteristics that researchers are typically constrained (by data availability) to examine.
In short, the results do reveal some meaningful, potentially policy-relevant associations between pre-service characteristics and future outcomes. From a more general perspective, however, they are also a testament to the difficulties inherent in predicting who will be a good teacher based on observable traits.