Revisiting the Chetty, Rockoff & Friedman Molehill

My kids and I don’t watch enough Phineas and Ferb anymore. Awesome show. I was reminded just yesterday of this great device!

This… is the Mountain-Out-Of-A-Molehill-INATOR! The name is rather self-explanatory – but here’s the official explanation anyway:

The Mountain-out-of-a-molehill-inator turns molehills into big mountains. It uses energy pellets to do so. It was created because all his life he was told “Don’t make mountains out of molehills”.

Now, I don’t mean to belittle the famed Chetty, Rockoff and Friedman study from a while back, which was quite the hit among policy wonks. As I explained in both my first and second posts on this study, it’s a heck of a study, with lots of interesting stuff… and one hell of a data set!

What irked me then, and has irked me all along, is the spin that was put on the study – and the fact that the spin was not just a matter of interpretation by politicos and the media, but was being fed by the study’s authors.

I figured that would eventually die down. I figured eventually cooler heads would prevail. But alas, I was wrong. Worst of all, we still have at least some of the study’s authors prancing around like Doofenshmirtz (pictured above) with their very own Mountain-out-of-a-molehill-inator!

So what the heck am I talking about? This! is what I’m talking about. This graph provides the basis for the oft-repeated claim that having a good teacher generates $266k in additional income for a classroom full of kids over their lifetime. $266k – that’s a heck of a lot of money! We must get all kids in classrooms with these amazing teachers!


This graph comes from a presentation given the other day to the New Jersey State Board of Education, in an effort to urge them to continue moving forward using Student Growth Percentiles as a substantial share of high stakes teacher evaluation (yes… to be used in part for dismissing the “bad” teachers, and retaining the “good” ones).

This graph shows us that the $266k figure actually comes from a differential of about $250! CHECK OUT THE VERTICAL AXIS ON THIS GRAPH! First of all, the authors chose to graph only one age (28) at which there was even a statistically significant difference in the earnings of children with super awesome versus only average teachers! The full range on the vertical axis GOES ONLY FROM $20,400 TO $21,200! And the trendline goes from $20,600 to $21,200 – for a total vertical range of about $600! Yeah… that’s a molehill… about 2.9%. The difference from the top to the average (albeit amidst a rather uncertain scatter) is only about $250. Now, the authors wouldn’t have generated quite the same buzz by pointing out that they found a wage differential of this magnitude – statistically significant or not – in a data set of this magnitude.

Here’s further explanation of their Mountain-out-of-a-molehill-inator calculation:


That’s right… just point the Mountain-Out-Of-A-Molehill-Inator at the graph above, and all of a sudden that rather small differential that occurs at one age (displayed as a huge effect by stretching the heck out of the Y axis) becomes $266k.
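For what it’s worth, a headline number of this shape is easy to reproduce with back-of-the-envelope arithmetic. The values below (per-student annual differential, class size, working years) are my own illustrative assumptions, not the authors’ actual discounted-present-value calculation – the point is only how small inputs inflate into a big total:

```python
# Back-of-envelope sketch of how a modest per-student annual differential
# becomes a headline "classroom lifetime" figure. All inputs are assumptions
# for illustration; the study's actual calculation involves discounting and
# projected earnings growth.
per_student_annual_gain = 250   # rough differential read off the age-28 graph, in dollars
class_size = 28                 # assumed number of students in the classroom
working_years = 38              # assumed working life, roughly ages 27 to 65

classroom_lifetime_gain = per_student_annual_gain * class_size * working_years
print(classroom_lifetime_gain)  # 266000
```

Multiply a $250 molehill by enough students and enough years, and out comes a mountain.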

Heck, why not multiply it by a whole freakin’ village! Or why not by the entire enrollment of NYC schools (the context for the study)? What if every kid in NYC for 10 straight years had awesome rather than sucky teachers? How much more would they earn over a lifetime?

I was somewhat forgiving of this playful spin the first time around, when they first released the paper. These are the kind of things authors do to playfully explain the magnitude of their results.  It’s one thing when this occurs as playful explanation in an academic context. It’s yet another when this is presented as a serious policy consideration to naive state policymakers – a result that somehow might plausibly occur if those policymakers move boldly forward in adopting a substantively different measure of teacher effectiveness to be used for firing all of the bad teachers.

What really are the implications of this study for practice – for human resource policy in local public (or private) schools? Well, not much! A study like this can be used to guide simulations of what might theoretically happen if we had 10,000 teachers, and were able to identify, with slightly better than even odds, the “really good” teachers – keep them, and fire the rest (knowing that we have high odds that we are wrongly firing many good teachers… but accepting this fact on the basis that we are at least slightly more likely to be right than wrong in identifying future higher vs. lower value added producers). As I noted in my previous post, this type of big data – this type of small margin-of-difference finding in big data – really isn’t helpful for making determinations about individual teachers in the real world. Yeah… works great in big-data simulations based on big-data findings, but that’s about it.
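To make that thought experiment concrete, here is a toy simulation – my own sketch, with made-up signal and noise levels, not the authors’ model – of ranking 10,000 teachers on a noisy value-added estimate and firing the bottom quarter:

```python
# Toy simulation of the thought experiment above: 10,000 teachers whose true
# quality is observed only through a noisy estimate, so that rankings are just
# modestly better than a coin flip. All parameters are illustrative assumptions.
import random

random.seed(0)
N = 10_000
true_quality = [random.gauss(0, 1) for _ in range(N)]
# Noisy value-added estimate: a little signal, a lot of noise
# (correlation with true quality of roughly 0.3).
estimate = [0.3 * q + random.gauss(0, 1) for q in true_quality]

# "Fire" the bottom quarter of teachers by estimated value added.
cutoff = sorted(estimate)[N // 4]
fired = [i for i in range(N) if estimate[i] < cutoff]

# How many of the fired teachers actually had above-average true quality?
wrongly_fired = sum(1 for i in fired if true_quality[i] > 0)
print(f"fired {len(fired)}, of whom {wrongly_fired} were above-average "
      f"({100 * wrongly_fired / len(fired):.0f}%)")
```

Under these assumptions the dismissal list is still stacked with above-average teachers – right more often than wrong, as the post says, but wrong an awful lot.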

Indeed it’s an interesting study, but to suggest that it has important immediate implications for school- and district-level human resource management is not only naive but reckless and irresponsible – and it must stop.

8 thoughts on “Revisiting the Chetty, Rockoff & Friedman Molehill”

  1. Really enjoyed this article and actually understood the graph!

Do you think the explanation is pure deviousness? Everyone knows what they are doing, Chetty et al and the politicians. Having a “study” is merely a pretext, a CYA to fall back on as an explanation for whatever it is they want to do? That’s the way polling is used. Whatever the objective is, once it’s accomplished there’s no going back, even if it was based on a lie. (The Iraq War, for example.)

  2. The policy makers who use that kind of data to implement change must have missed out on basic stats classes – or perhaps someone is selling tests and curriculum materials and needed to invent a market for sales. I wonder how they measured teacher performance over time to predict the students’ financial growth and get those figures?

  3. So that presentation was given by one of the authors? (The link is broken on that “presentation given to NJ…”).
    If so, it just boggles my mind that such a graph would be acceptable. Truncated and compressed Y-axes are a big pet peeve of mine.
    It is amazing how the controversy around the study at the time just didn’t stick as it now gets carried into the policy arena.
    I would propose multiplying the effect by the entire school age population, and then adding that to GDP. See, good kindergarten teachers can solve our deficit. No need to bicker over the sequester, or inconsequential government programs such as the Department of Defense, or Social Security, just fire a few hundred thousand teachers!

    1. I have less problem with these slides & related materials when presented as playful academic simulation, among academics who might reasonably understand/critique what the author is doing. But I have a really big problem with these being presented as serious policy implications to policymakers in a context where this is clearly about supporting a specific political agenda.

      The other part of this presentation that was twisted/disturbing was Rockoff’s spin on Kirabo Jackson’s study, in which he found that when, post-deseg, kids were sorted back into schools by race in North Carolina, there was a concurrent resorting of teachers by prior value added scores that disadvantaged newly predominantly black schools. Rockoff is now using this finding to suggest that any time we have previously seen what appears to be race/poverty related bias in VAMs, it’s not really bias, but rather a true sorting of teacher quality – 100% real sorting, 0% bias, as he presented it to NJBOE. He was basically arguing that SGP is fine and covariates don’t matter… in fact, implying that including covariates is wrong (because it would wipe away part of this real effect of teacher sorting by treating it as omitted variables bias). In the best case, we simply can’t know what portion is sorting based on prior value added and what portion is peer effect and/or other forms of omitted variables bias.

      But for Rockoff to suggest that Kirabo’s study proves undeniably that it’s all real quality sorting (thus prior scores, SGP or most basic VAM is perhaps the best option?) and should be treated as such in teacher evaluation/dismissal decisions is a twisted, irresponsible, and reckless leap of logic.

  4. Hi Bruce,

    The University of Phoenix would appreciate permission to link to one or more pages within your blog, School Finance 101. The links will be provided to students, and may point to specific pages within your site that we feel may be useful for certain classes. The content will not be copied. Although your terms of service seem to allow linking, we prefer to seek permission.

    Does the University of Phoenix have permission to link to pages in School Finance 101?

    Thank you very much for your time!

    The University of Phoenix is a for-profit university, accredited by the Higher Learning Commission, and is a member of the North Central Association.

Comments are closed.