The conventional wisdom is that Americans are becoming more tolerant over time. One of the common ways to measure this tolerance is to ask survey respondents whether they would be willing to have members of different groups – for example, people with different ethnicities, religions, sexual orientations, etc. – serve in positions of societal importance or trust, such as President, family doctor, or, of course, teacher.
Granted, people are not always forthcoming when asked sensitive questions of this sort, and one should always regard the distribution of responses with caution. That said, from an educational perspective, it might be interesting to take a look at Americans’ stated views about whether members of different groups should be allowed to teach, particularly whether and how these opinions have changed over time.
The General Social Survey includes several questions about who should be allowed to teach in a college or university, and the survey has asked these questions since 1972. We’ll start with four questions that are worded as follows: “There are always some people whose ideas are considered bad or dangerous by other people. For instance, somebody who is X. Should such a person be allowed to teach in a college or university, or not?” Read More »
Every four years, the National Center for Education Statistics provides the public with the best available national estimates of teacher attrition and mobility. The estimates come from the Teacher Follow-Up Survey (TFS), which is a supplement to the Schools and Staffing Survey (SASS), a much larger national survey of teachers that is also conducted every four years. Put simply, the TFS is a sub-sample of SASS respondents, who are contacted the following year to find out if and where they are still teaching.
The conventional wisdom among many commentators, particularly those critical of test-based accountability and recent education reform, is that teacher attrition (teachers leaving the profession) and mobility (teachers switching schools) are on the rise. As discussed in a previous post, this was indeed the case, at least at the national level, between the 1991-92 and 2004-05 school years, but ceased to be true between 2004-05 and 2008-09, during which time attrition and mobility were basically flat. A few months ago, results from the latest administration of the TFS, which tracked teachers between 2011-12 and 2012-13, were released, and it’s worth taking a quick look at the findings.
As you can see in the graph below, the proportion of public school teachers who left the profession entirely (“leavers”), as well as the proportion who switched schools (“movers”), were again relatively flat between 2008-09 and 2012-13 (and the change is not statistically significant). Read More »
One of the major considerations in designing accountability policy, whether in education or other fields, is what you might call accessibility. That is, both the indicators used to construct measures and how they are calculated should be reasonably easy for stakeholders to understand, particularly if the measures are used in high-stakes decisions.
This important consideration also generates great tension. For example, complaints that Florida’s school rating system is “too complicated” have prompted legislators to make changes over the years. Similarly, other tools – such as procedures for scoring and establishing cut points for standardized tests, and particularly the use of value-added models – are routinely criticized as too complex for educators and other stakeholders to understand. There is an implicit argument underlying these complaints: If people can’t understand a measure, it should not be used to hold them accountable for their work. Supporters of using these complex accountability measures, on the other hand, contend that it’s more important for the measures to be “accurate” than easy to understand.
I personally am a bit torn. Given the extreme importance of accountability systems’ credibility among those subject to them, not to mention the fact that performance evaluations must transmit accessible and useful information in order to generate improvements, there is no doubt that overly complex measures can pose a serious problem for accountability systems. It might be difficult for practitioners to adjust their practice based on a measure if they don’t understand that measure, and/or if they are unconvinced that the measure is transmitting meaningful information. And yet, the fact remains that measuring the performance of schools and individuals is extremely difficult, and simplistic measures are, more often than not, inadequate for these purposes. Read More »
Teachers in China are joining other workers in protesting their compensation and working conditions, reports the China Labour Bulletin (CLB), a workers’ rights monitoring and research group founded in Hong Kong in 1994 (CLB’s executive director, Han Dongfang, is a member of the Shanker Institute board of directors).
Throughout the past three months there have been at least 30 strikes by Chinese teachers. In the map below, which is taken from the CLB article, the numbers are strike frequencies. Many of them occurred in smaller cities and higher-poverty inland areas. For example, last month, over 20,000 teachers went on strike in cities and districts surrounding Harbin, the capital of the northeastern province of Heilongjiang.
The article notes that low (and/or unpaid) salaries are a recurrent theme in the protests, but there are a couple of other issues on the table that may sound familiar to those who follow U.S. education policy. Read More »
The District of Columbia Public Charter School Board (PCSB) recently released the 2014 results of their “Performance Management Framework” (PMF), which is the rating system that the PCSB uses for its schools.
Very quick background: This system sorts schools into one of three “tiers,” with Tier 1 being the highest-performing, as measured by the system, and Tier 3 being the lowest. The ratings are based on a weighted combination of four types of factors — progress, achievement, gateway, and leading — which are described in detail in the first footnote.* As discussed in a previous post, the PCSB system, in my opinion, is better than many others out there, since growth measures play a fairly prominent role in the ratings, and, as a result, the final scores are only moderately correlated with key student characteristics such as subsidized lunch eligibility.** In addition, the PCSB is quite diligent about making the PMF results accessible to parents and other stakeholders, and, for the record, I have found the staff very open to sharing data and answering questions.
That said, PCSB’s big message this year was that schools’ ratings are improving over time, and that, as a result, a substantially larger proportion of DC charter students are attending top-rated schools. This was reported uncritically by several media outlets, including this story in the Washington Post. It is also based on a somewhat questionable use of the data. Let’s take a very simple look at the PMF dataset, first to examine this claim and then, more importantly, to see what we can learn about the PMF and DC charter schools in 2013 and 2014. Read More »
So-called achievement gaps – the differences in average test performance among student subgroups, usually defined in terms of ethnicity or income – are important measures. They demonstrate persistent inequality of educational outcomes and economic opportunities between different members of our society.
So long as these gaps remain, it means that historically lower-performing subgroups (e.g., low-income students or ethnic minorities) are less likely to gain access to higher education, good jobs, and political voice. We should monitor these gaps; try to identify all the factors that affect them, for good and for ill; and endeavor to narrow them using every appropriate policy lever – both inside and outside of the educational system.
Achievement gaps have also, however, taken on a very different role over the past 10 or so years. The sizes of gaps, and extent of “gap closing,” are routinely used by reporters and advocates to judge the performance of schools, school districts, and states. In addition, gaps and gap trends are employed directly in formal accountability systems (e.g., states’ school grading systems), in which they are conceptualized as performance measures.
Although simple measures of the magnitude of or changes in achievement gaps are potentially very useful in several different contexts, they are poor gauges of school performance, and shouldn’t be the basis for high-stakes rewards and punishments in any accountability system. Read More »
A few weeks ago, the Minneapolis Star Tribune published teacher evaluation results for the district’s public school teachers in 2013-14. This decision generated a fair amount of controversy, but it’s worth noting that the Tribune, unlike the Los Angeles Times and New York City newspapers a few years ago, did not publish scores for individual teachers, only totals by school.
The data once again provide an opportunity to take a look at how results vary by student characteristics. This was indeed the focus of the Tribune’s story, which included the following headline: “Minneapolis’ worst teachers are in the poorest schools, data show.” These types of conclusions, which simply take the results of new evaluations at face value, have characterized the discussion since the first new systems came online. Though understandable, they are also frustrating and a potential impediment to the policy process. At this early point, “the city’s teachers with the lowest evaluation ratings” is not the same thing as “the city’s worst teachers.” Actually, as discussed in a previous post, the systematic variation in evaluation results by student characteristics, which the Tribune uses to draw conclusions about the distribution of the city’s “worst teachers,” could just as easily be viewed as one of the many ways that one might assess the properties and even the validity of those results.
So, while there are no clear-cut “right” or “wrong” answers here, let’s take a quick look at the data and what they might tell us. Read More »
The State of Florida is currently engaged in a policy tussle of sorts with the U.S. Department of Education (USED) over Florida’s accountability system. To make a long story short, last spring, Florida passed a law saying that the test scores of English language learners (ELLs) would only count toward schools’ accountability grades (and teacher evaluations) once the ELL students had been in the system for at least two years. This runs up against federal law, which requires that ELLs’ scores be counted after only one year, and USED has indicated that it’s not willing to budge on this requirement. In response, Florida is considering legal action.
This conflict might seem incredibly inane (unless you’re in one of the affected schools, of course). Beneath the surface, though, this is actually kind of an amazing story.
Put simply, Florida’s argument against USED’s policy of counting ELL scores after just one year is a perfect example of the reason why most of the state’s core accountability measures (not to mention those of NCLB as a whole) are so inappropriate: Because they judge schools’ performance based largely on where their students’ scores end up without paying any attention to where they start out. Read More »
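To make the distinction between where scores end up and where they start out concrete, here is a minimal sketch in Python. The two schools, their students, and all scores are invented for illustration; they are not Florida data, and the `status` and `growth` functions are simplified stand-ins, not the state's actual measures.

```python
from statistics import mean

# Hypothetical (start, end) scale scores for students in two made-up schools.
school_a = [(300, 310), (310, 318), (305, 312)]  # high starting scores, small gains
school_b = [(240, 270), (250, 282), (245, 275)]  # low starting scores, large gains


def status(students):
    """Average end-of-year score; ignores where students started."""
    return mean(end for _, end in students)


def growth(students):
    """Average gain from the start to the end of the year."""
    return mean(end - start for start, end in students)


for name, students in [("A", school_a), ("B", school_b)]:
    print(f"School {name}: status={status(students):.1f}, growth={growth(students):.1f}")
```

In this made-up example, School A looks better on the status measure while School B's students gain far more over the year, which is the kind of difference a status-only rating cannot see.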
A new Mathematica report examines the test-based impact of The Equity Project (TEP), a New York City charter school serving grades 5-8. TEP opened up for the 2009-10 school year, receiving national attention mostly due to one unusual policy: They paid teachers $125,000 per year, regardless of experience and education, in addition to annual bonuses (up to $25,000) for returning teachers. TEP largely makes up for these unusually high salary costs by minimizing the number of administrators and maintaining larger class sizes.
As is typical of Mathematica, the TEP analysis is thorough and well-done. The school’s students’ performance is compared to that of similar peers with a comparable probability of enrolling in TEP, as identified with propensity scores. In general, the study’s results were quite positive. Although there were statistically discernible negative impacts of attendance for TEP’s first cohort of students during their first two years, the cumulative estimated test-based impact was significant, positive and educationally meaningful after three and four years of attendance. As always, the estimated effect was stronger in math than in reading (estimated effect sizes for the former were very large in magnitude). The Mathematica researchers also present analyses on student attrition, which did not appear to bias the estimates substantially, and they also show that their primary results are robust when using alternative specifications (e.g., different matching techniques, score transformations, etc.).
Now we get to the tricky questions about these results: What caused them and what can be learned as a result? That’s the big issue with charter analyses in general (and with research on many other interventions): One can almost never separate the “why” from the “what” with any degree of confidence. And TEP, with its “flagship policy” of high teacher salaries, which might appeal to all “sides” in the education policy debate, provides an interesting example in this respect. Read More »
The College Board recently released the latest SAT results, for the first time combining this release with that of data from the PSAT and AP exams. The release of these data generated the usual stream of news coverage, much of which misinterpreted the year-to-year changes in SAT scores as a lack of improvement, even though the data are cross-sectional and the test-taking sample has been changing, and/or misinterpreted the percent of test takers who scored above the “college ready” line as a national measure of college readiness, even though the tests are not administered to a representative sample of students.
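The point about a changing test-taking sample can be illustrated with a small sketch. The two groups, their sizes, and their average scores below are entirely hypothetical, chosen only to show how an overall average can fall even when every group's average rises.

```python
def overall_mean(groups):
    """Weighted mean score across groups given as (n_test_takers, mean_score)."""
    total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total


# Hypothetical data: (number of test takers, average score) for two groups.
year1 = [(800, 1100), (200, 900)]
# In year 2 both groups score 10 points higher, but the lower-scoring
# group doubles in size as more students take the test.
year2 = [(800, 1110), (400, 910)]

print(f"Year 1 overall average: {overall_mean(year1):.1f}")
print(f"Year 2 overall average: {overall_mean(year2):.1f}")
```

Under these assumed numbers the overall average declines from one year to the next even though both groups improved, so a flat or falling cross-sectional average does not by itself show a lack of improvement.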
It is disheartening to watch this annual exercise, in which the most common “take home” headlines (e.g., “no progress in SAT scores” and “more, different students take SAT”) are in many important respects contradictory. In past years, much of the blame had to be placed on the College Board’s presentation of the data. This year, to their credit, the roll-out is substantially better (hopefully, this will continue).
But I don’t want to focus on this aspect of the organization’s activities (see this post for more); instead, I would like to discuss briefly the College Board’s recent change in mission. Read More »
The Foundation for Excellence in Education, an organization that advocates for education reform in Florida, in particular the set of policies sometimes called the “Florida Formula,” recently announced a competition to redesign the “appearance, presentation and usability” of the state’s school report cards. Winners of the competition will share prize money totaling $35,000.
The contest seems like a great idea. Improving the manner in which education data are presented is, of course, a laudable goal, and an open competition could potentially attract a diverse group of talented people. As regular readers of this blog know, however, I am not opposed to sensibly designed test-based accountability policies; my primary concern about school rating systems is the quality and interpretation of the measures used therein. So, while I support the idea of a competition for improving the design of the report cards, I am hoping that the end result won’t just be a very attractive, clever instrument devoted to the misinterpretation of testing data.
In this spirit, I would like to submit four simple graphs that illustrate, as clearly as possible and using the latest data from 2014, what Florida’s school grades are actually telling us. Since the scoring and measures vary a bit between different types of schools, let’s focus on elementary schools. Read More »
There’s no reason why insisting on proper causal inference can’t be fun.
A few weeks ago, ASCD published a policy brief (thanks to Chad Aldeman for flagging it), the purpose of which is to argue that it is “grossly misleading” to make a “direct connection” between nations’ test scores and their economic strength.
On the one hand, it’s implausible to assert that better educated nations aren’t stronger economically. On the other hand, I can certainly respect the argument that test scores are an imperfect, incomplete measure, and the doomsday rhetoric can sometimes get out of control.
In any case, though, the primary piece of evidence put forth in the brief was the eye-catching graph below, which presented trends in NAEP versus those in U.S. GDP and productivity. Read More »
In observing all the recent controversy surrounding the Common Core State Standards (CCSS), I have noticed that one of the frequent criticisms from one of the anti-CCSS camps, particularly since the first rounds of results from CCSS-aligned tests have started to be released, is that the standards are going to be used to label more schools as “failing,” and thus ramp up the test-based accountability regime in U.S. public education.
As someone who is very receptive to a sensible, well-designed dose of test-based accountability, but sees so little of it in current policy, I am more than sympathetic to concerns about the proliferation and misuse of high-stakes testing. On the other hand, anti-CCSS arguments that focus on testing or testing results are not really arguments against the standards per se. They also strike me as ironic, as they are based on the same flawed assumptions that critics of high-stakes testing should be opposing.
Standards themselves are about students. They dictate what students should know at different points in their progression through the K-12 system. Testing whether students meet those standards makes sense, but how we use those test results is not dictated by the standards. Nor do standards require us to set bars for “proficient,” “advanced,” etc., using the tests. Read More »
One of the more visible manifestations of what I have called “informal test-based accountability” — that is, how testing results play out in the media and public discourse — is the phenomenon of superintendents, particularly big city superintendents, making their reputations based on the results during their administrations.
In general, big city superintendents are expected to promise large testing increases, and their success or failure is to no small extent judged on whether those promises are fulfilled. Several superintendents almost seem to have built entire careers on a few (misinterpreted) points in proficiency rates or NAEP scale scores. This particular phenomenon, in my view, is rather curious. For one thing, any district leader will tell you that many of their core duties, such as improving administrative efficiency, communicating with parents and the community, strengthening districts’ financial situation, etc., might have little or no impact on short-term testing gains. In addition, even those policies that do have such an impact often take many years to show up in aggregate results.
In short, judging superintendents based largely on the testing results during their tenures seems misguided. A recent report issued by the Brown Center at Brookings, and written by Matt Chingos, Grover Whitehurst and Katharine Lindquist, adds a little bit of empirical insight to this viewpoint. Read More »
In the most simplistic portrayal of the education policy landscape, one of the “sides” is a group of people who are referred to as “reformers.” Though far from monolithic, these people tend to advocate for test-based accountability, charters/choice, overhauling teacher personnel rules, and other related policies, with a particular focus on high expectations, competition and measurement. They also frequently see themselves as in opposition to teachers’ unions.
Most of the “reformers” I have met and spoken with are not quite so easy to categorize. They are also thoughtful and open to dialogue, even when we disagree. And, at least in my experience, there is far more common ground than one might expect.
Nevertheless, I believe that this “movement” (to whatever degree you can characterize it in those terms) may be doomed to stall out in the long run, not because their ideas are all bad, and certainly not because they lack the political skills and resources to get their policies enacted. Rather, they risk failure for a simple reason: They too often make promises that they cannot keep. Read More »