Championing Fact-Challenged Facts

The New Teacher Project and Students First have recently posted/cross-posted one of the more impressively fact-challenged manifestos I’ve encountered.

The core argument in this recent post is that the facts on education reform speak for themselves and that the facts, as they describe them, simply need a champion – someone to make the public aware of these undeniable facts. However, the dreaded and evil teachers’ union, with its stranglehold over the media and public opinion, is dead set on obfuscating the undeniable facts about the effectiveness of recent education reforms. As they put it:

The reality is that while unions and their allies have the motivation, discipline and resources to get their messages out and repeat them endlessly, the facts have no champion.

So then, what are these supposed facts that the teachers’ union has so successfully obfuscated?

 The Facts According to TNTP/SF: U.S. Failure on PISA

According to TNTP and SF…

There’s no disputing that the results are pretty dismal—15-year-olds in the United States ranked 30th in math, 23rd in science and 20th in reading among participating industrialized countries. But the conversation about the PISA results was just as depressing.

Hayes argued that these results were a reflection of income inequality, not the poor quality of our schools, that we rank near the bottom because we have “so many test takers from the bottom of the social class distribution.” It’s a ridiculous assertion, and one that is easily disproved by a close look at the data, which compare the performance of students with similar socio-economic backgrounds around the globe. The wealthiest American 15-year-olds, for example—those in the top socio-economic quartile—rank 26th in math compared to their affluent peers elsewhere. In other words, poverty does not explain the poor performance of our K-12 education system. (Amanda Ripley has more on this, which you can read here.)

That’s right… no disputing. We all know it. It’s a simple fact. U.S. schools stink when compared on simple rankings to other countries… and this stinkiness can be attributed to bad teaching, limited choice and unions, of course. Okay… they didn’t say that… but it does seem implied by the fact that their blog post blames unions and Randi Weingarten specifically for denying the facts and creating false public messages. Most importantly, Amanda Ripley, quantitative researcher extraordinaire, proves that poverty has nothing to do with our massive failure!

What do we actually know about U.S. Performance on PISA?

Here’s what I wrote back on PISA day!

With today’s release of PISA data it is once again time for wild punditry, mass condemnation of U.S. public schools and a renewed sense of urgency to ram through ill-conceived, destructive policies that will make our school system even more different from those breaking the curve on PISA.

With that out of the way, here’s my little graphic contribution to what has become affectionately known to the edu-pundit class as PISA-Palooza. Yep… it’s the ol’ poverty as an excuse graph – well, really it’s just the ol’ poverty in the aggregate just so happens to be pretty strongly associated with test scores in the aggregate – graph… but that’s nowhere near as catchy.


PISA Data: (table M4)

OECD Relative Poverty: Source: Provisional data from OECD Income distribution and poverty database (

Yep – that’s right… relative poverty – or the share of children in families below 50% of median income – is reasonably strongly associated with Math Literacy PISA scores. And this isn’t even a particularly good measure of actual economic deprivation. Rather, it’s the measure commonly used by OECD and readily available. Nonetheless, at the national aggregate, it serves as a pretty strong correlate of national average performance on PISA.

What our little graph tells us – albeit not really that meaningful – is that if we account (albeit poorly) for child poverty, the U.S. is actually beating the odds. Way to go? (but for that really high poverty rate).
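To make the "beating the odds" idea concrete, here is a minimal sketch of the arithmetic behind that kind of graph: fit a straight line of average score on child poverty rate, then look at which countries sit above or below the line. The five countries and their numbers are invented for illustration and are not the PISA or OECD figures cited above; `fit_line` is a hypothetical helper, and this is still only a description of a pattern, not a causal model.

```python
# Hypothetical data: (relative child poverty rate in %, average math score).
# These values are made up for illustration, NOT the PISA/OECD figures.
countries = {
    "A": (5.0, 530.0),
    "B": (10.0, 510.0),
    "C": (15.0, 490.0),
    "D": (20.0, 485.0),  # highest poverty, lowest raw score
    "E": (12.0, 495.0),
}

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

names = list(countries)
xs = [countries[k][0] for k in names]
ys = [countries[k][1] for k in names]
intercept, slope = fit_line(xs, ys)

# Residual = actual score minus the score predicted from poverty alone.
# A positive residual is what "beating the odds" means in this sense.
resid = {k: y - (intercept + slope * x) for k, x, y in zip(names, xs, ys)}

for k in sorted(names, key=lambda k: -resid[k]):
    print(k, "raw score:", countries[k][1], "residual:", round(resid[k], 1))
```

In this made-up example, country D comes last on the raw ranking but has the largest positive residual, which is the sense in which a simple ranking and a poverty-adjusted comparison can tell different stories.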

Bottom line – economic conditions matter and simple rankings of countries by their PISA scores aren’t particularly insightful (and the above graph only marginally more insightful). Further, comparing cities in China to entire nations is a particularly silly approach.

But then how does one explain away Amanda Ripley’s supposed brilliant rebuttal of the poverty concern? Note that she points to a table of how children in the top quartile within the United States according to an OECD socioeconomic index compare to children in the top quartile within other countries. This is a major math/logic fail on the part of Ripley and others interpreting these data. You see, the top quarter within a poorer country is, well, poorer than the top quarter within a richer country. So really, the above graph still applies.
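The top-quarter point can be checked with a quick simulation. This is only a sketch under loud assumptions: I use simulated household income as a stand-in for the OECD socioeconomic index, draw both "countries" from lognormal distributions with the same spread, and set the poorer country's typical income to about half the richer one's. None of these numbers come from the PISA data.

```python
import random

def top_quartile_cutoff(values):
    """Value at the 75th percentile (nearest rank on sorted data)."""
    ordered = sorted(values)
    return ordered[int(0.75 * len(ordered))]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed distributions: same spread, poorer country's median is roughly half.
richer = [rng.lognormvariate(11.0, 0.6) for _ in range(10_000)]
poorer = [rng.lognormvariate(11.0 - 0.693, 0.6) for _ in range(10_000)]

rich_cut = top_quartile_cutoff(richer)
poor_cut = top_quartile_cutoff(poorer)

# Who counts as "top quarter" in the poorer country, and how many of them
# would also clear the bar for "top quarter" in the richer country?
poor_top = [v for v in poorer if v >= poor_cut]
share_clearing_rich_bar = sum(v >= rich_cut for v in poor_top) / len(poor_top)

print("richer country's top-quarter cutoff:", round(rich_cut))
print("poorer country's top-quarter cutoff:", round(poor_cut))
print("share of poorer top quarter above richer cutoff:",
      round(share_clearing_rich_bar, 2))
```

Under these assumptions, only a small minority of the poorer country's "top quarter" would qualify as top quarter in the richer country, so a within-country quartile comparison does not hold economic conditions constant across countries.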

But to illustrate my point, here are the countries – and Chinese Cities… and Singapore (hardly a relevant comparison) – ranked by math score, including some specific U.S. States. The top quarter of students in a “richer” U.S. state (because the top quarter among the rich are richer than the top quarter among the poor) seems to do pretty darn well… with Massachusetts being beaten only by Korea (along with select Chinese cities and Singapore – hardly relevant comparisons). Of course, referring to these comparisons as comparing the wealthy, or affluent in one country to the wealthy, or affluent in another is offensive enough to begin with. It’s all relative.

So, NO… the scores of our top quarter falling behind those in the top quarter of other nations does NOT by any means contradict the finding that poverty matters. In fact, breaking out U.S. States of varied poverty levels and ranking them among countries in this very graph provides additional support that economic context remains the primary driver of jurisdictional aggregate test score comparisons (or maybe these scores prove that Florida’s education reforms are a dreadful failure?).

The Facts According to TNTP/SF: Test Based School Closures Improve Outcomes!

This particular quote is truly baffling, since the linked study provides no support for the actual claim made in the quote – that policies such as closing failing schools based on test-score-based accountability are leading to performance gains.

And research also shows that these gains were not achieved through happenstance. They were caused, in part, by the very policies Randi decries, such as closing failing schools based on test-score-based accountability systems.

What does the linked study actually say?

The MDRC study linked above focused on the longer term outcomes of students attending small high schools in New York City. While it may be the case that some students migrated to these small high schools after having their larger neighborhood high schools closed, for any number of reasons including test-based accountability, that was not the emphasis of the study. As stated in the study summary itself, here are the findings:

  • Small high schools in New York City continue to markedly increase high school graduation rates for large numbers of disadvantaged students of color, even as graduation rates are rising at the schools with which SSCs are compared. For the full sample, students at small high schools have a graduation rate of 70.4 percent, compared with 60.9 percent for comparable students at other New York City high schools.
  • The best evidence that exists indicates that small high schools may increase graduation rates for two new subgroups for which findings were not previously available: special education students and English language learners. However, given the still-limited sample sizes for these subgroups, the evidence will not be definitive until more student cohorts can be added to the analysis.
  • Principals and teachers at the 25 small high schools with the strongest evidence of effectiveness strongly believe that academic rigor and personal relationships with students contribute to the effectiveness of their schools. They also believe that these attributes derive from their schools’ small organizational structures and from their committed, knowledgeable, hardworking, and adaptable teachers.

The Facts According to TNTP/SF: DC & Tennessee NAEP Gains!

And finally, here’s one I’ve blogged about more than once in recent months – the bold and completely unfounded claim that NAEP gains in Washington DC and Tennessee provide proof positive of the value of recent “reforms” toward improving student outcomes.

So why the cognitive dissonance? While no one should be declaring victory based on these results (a large majority of kids in New York still do NOT graduate college-ready), you might expect that the city’s results (and the most recent NAEP results, which show similarly impressive gains in Washington, D.C. and Tennessee) would give Weingarten and like-minded stakeholders some pause before they continue to issue blanket indictments of the reform agenda.

And about that claim of DC & Tennessee “impressive” gains?

As I explain in my recent post, for these latest findings to actually validate that teacher evaluation and/or other favored policies are “working” to improve student outcomes, two empirically supportable conditions would have to exist.

  • First, that the gains in NAEP scores have actually occurred – changed their trajectory substantively – SINCE implementation of these reforms.
  • Second, that the gains achieved by states implementing these policies are substantively different from the gains of states not implementing similar policies, all else equal.

And neither claim is true, as I explain more thoroughly here! But here’s a quick graphic run down.

First, major gains in DC actually started long before recent evaluation reforms, whether we are talking about common core adoption or DC IMPACT. In fact, the growth trajectory really doesn’t change much in recent years.  But hey, assertions of retro-active causation are actually more common than one might expect!

 Figure 11


Note also that DC has experienced demographic change over time, an actual decline in cohort poverty rates over time and that these supposed score changes over time are actually simply score differences from one cohort to the next. This is not to downplay the gains, but rather to suggest that it’s rather foolish to assert that policies of the past few years have caused them.
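The first condition (did the trajectory actually change after the reforms?) amounts to a simple before/after comparison of average gains per year. Here is a rough sketch of that check. The score series and the `policy_year` are hypothetical placeholders, not the actual NAEP values or reform dates, and the comparison remains purely descriptive, since each point is a different cohort of students.

```python
def average_annual_gain(points):
    """Mean score change per year between the first and last (year, score)."""
    (first_year, first_score), (last_year, last_score) = points[0], points[-1]
    return (last_score - first_score) / (last_year - first_year)

# Hypothetical cohort averages for one jurisdiction (NOT actual NAEP data).
series = [(2003, 205), (2005, 211), (2007, 216),
          (2009, 221), (2011, 225), (2013, 231)]
policy_year = 2009  # assumed implementation year, for illustration only

before = [p for p in series if p[0] <= policy_year]
after = [p for p in series if p[0] >= policy_year]

gain_before = average_annual_gain(before)
gain_after = average_annual_gain(after)

print("points per year before policy:", round(gain_before, 2))
print("points per year after policy:", round(gain_after, 2))
print("change in trajectory:", round(gain_after - gain_before, 2))
```

If the per-year gain after the policy year is about the same as (or smaller than) the gain before it, the series by itself gives no reason to credit the policy for the gains.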

Second, comparing cohort achievement gains (adjusted for initial scores… since lower scoring states have higher average gains on NAEP) with STUDENTS FIRST’s own measures of “reformyness” we see first that DC and TN really aren’t standouts, that other reformy states actually did quite poorly (states on the right hand side of the graphs that fall below the red line), and many non-reformy states like Maryland, New Jersey, New Hampshire and Massachusetts do quite well (states toward the middle or left that appear well above the line).

Needless to say, if we were to simply start with these graphs and ask ourselves, who’s kickin’ butt on NAEP gains… and are states with higher grades on Students First policy preferences systematically more likely to be kickin’ butt, the answers might not be so obvious. But if we start with the assumption that DC and TN are kicking butt and have the preferred policies, and then just ignore all of the others, we can construct a pretty neat – but completely invalid story line.

 Figure 12
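For readers who want to see what "gains adjusted for initial scores" means mechanically, here is one way it could be done: regress gain on initial score, keep the residual as the adjusted gain, and then compare adjusted gains across states with higher and lower policy grades. This is a sketch, not the exact procedure behind the figure; the six states, their scores and their grades are all invented, and `adjusted_gains` is a hypothetical helper.

```python
def adjusted_gains(initial, gains):
    """Residuals from an OLS fit of gain on initial score.

    Lower-scoring states tend to show larger gains, so the residual is the
    part of the gain not predicted by where a state started.
    """
    n = len(initial)
    mean_x = sum(initial) / n
    mean_y = sum(gains) / n
    sxx = sum((x - mean_x) ** 2 for x in initial)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(initial, gains))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(initial, gains)]

# Hypothetical states: (initial score, gain, policy grade on a 0-4 scale).
states = {
    "S1": (230.0, 9.0, 1.0),
    "S2": (240.0, 6.0, 3.0),
    "S3": (250.0, 7.0, 0.5),
    "S4": (260.0, 4.0, 3.5),
    "S5": (270.0, 5.0, 1.5),
    "S6": (280.0, 2.0, 2.5),
}

names = list(states)
adj = dict(zip(names, adjusted_gains([states[k][0] for k in names],
                                     [states[k][1] for k in names])))

# Split on an arbitrary grade threshold and compare mean adjusted gains.
high = [adj[k] for k in names if states[k][2] >= 2.0]
low = [adj[k] for k in names if states[k][2] < 2.0]
print("mean adjusted gain, higher-grade states:", round(sum(high) / len(high), 2))
print("mean adjusted gain, lower-grade states:", round(sum(low) / len(low), 2))
```

The second condition asks whether the first number is substantively larger than the second; picking out one or two high-grade states with big gains and ignoring the rest does not answer that question.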


And those, my friends, are the facts!



  1. A few years ago we went to China with Virginia Tech’s Summer Educational Workshop. We found out that in what we would call 8th grade all students take an exam to sort out who will go to high school. The students that did not meet the cutoff on the exam were given a choice: either their parents would have to pay for them to attend high school, at a rate equal to the annual salary of a pharmacist, or they would attend work-related training. Hence, one would see that there was a strong motivation for parents and students to excel on this competitive exam. Here there is no sorting out like that in many countries. When a K-8 colleague principal asked about special education students we were told that there were none. When pushed on the topic we were told that they were not included in general education programs or data.
    Additionally, States like Mass., NJ., NY, Conn. all with strong unions appear to have the strongest scores and multiple measures of success. So why all the name calling and blaming? States that appear to have the larger number of National Board Certified teachers don’t have the stats that Mass, NJ or Conn have. They do have a larger number of economically disadvantaged schools and/or students. What does that tell you? I would like to see all that real data expressed visually.

  2. Enough with the correlations!!! I appreciate your use of data in your blog posts and your coverage of equity (a topic that doesn’t receive nearly as much coverage as it should), but, please, for the sake of your readers who have not taken Stats 101, acknowledge that the slope of a simple regression equation should not be interpreted as the causal relationship between two variables.

    I completely support and appreciate you highlighting the ridiculous and fact-ignorant posts, articles, etc. that get posted about education policy, but please be more transparent about the drawbacks of your own logic/math.

    1. You continue to miss the point entirely that these scatterplots and simple regression are not intended to be anything other than descriptions of patterns within the data. At no point am I inferring causation from these scatterplots. But rather, strong associations in some cases, weaker in others…. but in all cases DESCRIPTIVE of patterns… patterns based on data points collected on underlying processes. The problem here is that others are presenting entirely data free arguments that are so easily refuted even with simple descriptives, be they in one or two dimensions. I spend a great deal of time teaching my own students about these very limitations, but also that exploratory, descriptive analysis is a useful way to develop understanding of patterns in data and patterns of relationships among various measures derived from complex systems. So, I urge you to get off your high horse of presuming you know better what you’re talking about… and take the time to read more carefully what I am providing in these posts.

      1. Descriptive analysis is incredibly useful, and nothing in my above comment is meant to suggest otherwise. Nor is the above comment meant to suggest I know more than you about anything, because that is simply not true. But data-free arguments are NOT always refuted with simple descriptive analysis, and I’ve rarely seen any acknowledgement of this since I’ve started reading your blog.

        Take your NAEP/Teacher Eval argument. If I’m able to understand correctly from all the way up on my high horse, you say that teacher eval reform is not the cause of the growth seen in TN and DC because other states that implemented similar policies did not show growth, and other states that demonstrated growth did not implement teacher eval reform. This evidence does nothing to disprove the claim that, in DC or TN, teacher eval reform did have a positive effect on student growth. Given the unique policy context and implementation in each state it’s not surprising to see different effects of a similar policy in different states.

        A policy that has a positive effect in one state can have a completely different effect in another state and states can achieve the same outcomes with very different policies.

        As you said, I may be missing the point entirely, but it seems to me like you are not presenting the data to simply illustrate patterns and relationships, but instead to disprove arguments of others. It would be a service to your readers for you to acknowledge the drawbacks to your own presented analysis, especially when so many of your readers seem to be interpreting your results as causal given the comments that are left on your blog.

      2. Again you misinterpret and misrepresent, with my NAEP example being a perfect example. I am NOT saying that my scatterplots or descriptives prove that teacher evaluation doesn’t matter (same with the NAEP poverty discussion in my Petrilli response). Rather, I said quite explicitly that for DC and TN gains between 2011 and 2013 to validate that their reforms do matter requires meeting two conditions:

        First, that the gains in NAEP scores have actually occurred – changed their trajectory substantively – SINCE implementation of these reforms.

        Second, that the gains achieved by states implementing these policies are substantively different from the gains of states not implementing similar policies, all else equal.

        As I show in the DC time series, gains were faster even before the policies were implemented. That rather simple fact (albeit contingent on samples, artificial scales, etc.) should raise some concerns about the assertion that more recent policies are causing the gains. Further, descriptively in two dimensions, gains were no greater on average in states/jurisdictions implementing Students First favored policies than in jurisdictions not doing so. In fact, part of the point there is precisely what you raise above – that many policies and conditions are interacting all at once, and suggesting that any one is cause for great, short term gains, is absurd. I do not offer the scatterplots in any way as proof that teacher evaluation policies “don’t work” (though I’m certainly a skeptic), but rather as further evidence that the above requirements simply aren’t met – really basic conditions for validating THEIR argument.

        As a simple point of logic, I need not prove the opposite – prove that teacher evaluation doesn’t matter – in order to prove that their evidence that it does is junk.

        The point here is to show, through very basic presentations of information, just how flimsy some of these arguments are. Do I like spending my time and effort this way? Hell no. It irks me to no end to have to spend so much time slamming down stupid with the simple stick. But the rate of production of completely stupid arguments seems to escalate as every day passes, sadly, leaving me less and less time to do the more important and useful, more nuanced and informative analyses. I especially had not planned to waste time yesterday on that post, but it could not be ignored. That said, I hope today to get back to something far more substantive.
