More on NAEP Poverty Gaps & Why State Comparisons Don’t Work

This post is a follow-up to a recent post on how income distributions differ across states and how those income distributions thwart our ability to make reasonable comparisons across states in the size of achievement gaps in relation to low-income status. This series of posts on NAEP poverty gaps comes in response to a tweet on May 4 from Lisa Fleisher of the WSJ.  Lisa was quoting NJ Education Commissioner Cerf on NJ school performance.

  • @lisafleisher Lisa Fleisher
  • Cerf on performance of NJ schools compared w/nation: 5th best in country. But gap btwn rich/poor = 47th highest gap. An “astounding figure”

Cerf has had some difficulties in the past making reasonable (honest) presentations of achievement data – specifically with respect to the influence of poverty measurement.

To review (so you don’t have to necessarily go back and read the other post, which is here):

Here’s the basic framing adopted by most who report on this stuff:

Non-Poor Child Test Score – Poor Child Test Score = Poverty Achievement Gap

Non-Poor Child in State A = Non-Poor Child in State B

Poor Child in State A = Poor Child in State B

These conditions have to be met for there to be any validity to rankings of achievement gaps.

Now, here’s the problem.

Poor = child from a family whose income falls below 185% of the income cut point for poverty

Therefore, the measurement of an achievement gap between “poor” and “non-poor” is:

Average NAEP of children above 185% poverty threshold – Average NAEP of children below 185% poverty threshold = “Poverty” achievement Gap
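To make the arithmetic concrete, here is a minimal Python sketch of that gap calculation. The poverty line, incomes, and scores are all made-up values for illustration (they are not NAEP or Census figures), and `poverty_gap` is a hypothetical helper name.

```python
from statistics import mean

# Assumed, illustrative poverty line; NOT an official figure.
POVERTY_LINE = 22_000
CUTOFF = 1.85 * POVERTY_LINE  # the 185% threshold

def poverty_gap(students, cutoff=CUTOFF):
    """Mean score of students at or above the cutoff minus mean score below it."""
    above = [s["score"] for s in students if s["income"] >= cutoff]
    below = [s["score"] for s in students if s["income"] < cutoff]
    return mean(above) - mean(below)

# Hypothetical students (family income, test score), invented for this example.
students = [
    {"income": 18_000, "score": 262},
    {"income": 35_000, "score": 270},
    {"income": 60_000, "score": 285},
    {"income": 150_000, "score": 301},
]

gap = poverty_gap(students)
print(gap)
```

Note that the calculation treats the families at 60,000 and 150,000 as the same kind of "non-poor" family, which is exactly the simplification at issue here.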

But the income level for poverty does not vary by state or region. See:

As a result, the distribution of children and their families above and below the specified threshold varies widely from state to state, and comparing the average performance of the groups of children above that threshold and below it is not particularly meaningful.  Comparing those gaps across states is really problematic.
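A small sketch may help show what a single cut point does to two differently shaped income distributions. The two "states" below are hypothetical, their incomes are invented (not American Community Survey data), and the 40,000 cutoff is an assumed round number rather than the actual threshold.

```python
from statistics import mean

CUTOFF = 40_000  # assumed single national cutoff, identical in every state

# Invented family incomes for two hypothetical states.
state_a = [15_000, 30_000, 38_000, 45_000, 55_000, 65_000]
state_b = [25_000, 35_000, 39_000, 80_000, 150_000, 250_000]

def split_means(incomes, cutoff=CUTOFF):
    """Return (mean income below cutoff, mean income at or above cutoff)."""
    below = [x for x in incomes if x < cutoff]
    above = [x for x in incomes if x >= cutoff]
    return mean(below), mean(above)

for name, incomes in (("A", state_a), ("B", state_b)):
    below_avg, above_avg = split_means(incomes)
    # The income gap is the distance between the two group averages.
    print(name, round(below_avg), round(above_avg), round(above_avg - below_avg))
```

Both states have three families on each side of the line, yet the "non-poor" group in state B is far richer than the one in state A. The same labels describe very different groups.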

While I showed how different the poverty and income distributions were in Texas and New Jersey as an example, I didn’t necessarily go far enough in that post to explain how/why these distribution differences thwart comparisons of low-income vs. non-low-income achievement gaps. Yes, it should be clear enough that the above-the-line and below-the-line groups just aren’t similar across these two states, or across nearly any other pair of states.

A logical extension of the analysis in that previous post would be to look at the relationship between:

Gap in average family total income between those above and below the free or reduced price lunch cut-off


Gap in average NAEP scores between children from families above and below the free or reduced price lunch cut-off

If there is much of a relationship between the income gaps and the NAEP gaps – that is, states with larger income gaps between the poor and non-poor groups also have larger achievement gaps – such a finding would call into question the usefulness of state comparisons of these gaps.
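One simple way to check for such a relationship is a Pearson correlation between the two state-level gaps. The sketch below is only an illustration of that check, not the analysis behind the graphs in this post: the five state values are invented, and `pearson_r` is a hand-written helper so the example needs nothing beyond the standard library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical state-level values, invented for illustration only:
# income gap (dollars) and NAEP score gap (points) between the
# above-cutoff and below-cutoff groups in each state.
income_gap = [40_000, 55_000, 70_000, 90_000, 110_000]
naep_gap = [18, 22, 21, 27, 30]

r = pearson_r(income_gap, naep_gap)
print(round(r, 2))
```

A value of r near 1 would mean that states with larger income gaps also tend to show larger score gaps, which is the pattern that would undermine the rankings.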

So, let’s walk through this step by step.

First, here is the relationship across states between NAEP Math Grade 8 scores and family total income levels for children in families ABOVE the free or reduced price lunch cutoff:

There is a modest relationship between income levels of non-low income children and NAEP scores. Higher income states generally have higher NAEP scores. No adjustments are applied in this analysis to the value of income from one location to another, mainly because no adjustments are applied in the setting of the poverty thresholds. Therein lies at least some of the problem. The rest lies in using a simple ABOVE vs. BELOW a single cut point approach.

Second, here’s the relationship between the average income of families below the free or reduced lunch cut point and the average NAEP scores on 8th Grade Math (2009).

This relationship is somewhat looser than the previous relationship and for logical reasons – mainly that we have applied a single low-income threshold to every state, and the average income of individuals below that single income threshold does not vary as widely across states as the average income of individuals above that threshold. Further, the income threshold is arbitrary and not sensitive to the differences in the value of any given income level across states. But still, there is some variation, with some states having much larger clusters of very low-income families below the free or reduced price lunch threshold (Mississippi).


Third, this graph shows the relationship between income gaps, estimated using American Community Survey data from 2005 to 2009, and NAEP gaps. This graph addresses directly the question posed above – whether states with larger gaps in income between families above and below the arbitrary low-income threshold also have larger gaps in NAEP scores between children from families above and below the arbitrary threshold.

In fact, they do. And this relationship is stronger than either of the two previous relationships. As a result, it is somewhat foolish to try to make any comparisons between achievement gaps in states like Connecticut, New Jersey and Massachusetts versus states like South Dakota, Idaho or Wyoming. It is, for example, more reasonable to compare New Jersey and Massachusetts to Connecticut, but even then, other factors may complicate the analysis.

3 thoughts on “More on NAEP Poverty Gaps & Why State Comparisons Don’t Work”

  1. The poverty line number is ridiculous, based on 3x the cost of a 1960 basket of groceries and scaled each year by the CPI. It has nothing to do with the actual cost of living in any state, and the fact that the number is the same in San Francisco as in Fargo, North Dakota just adds to the silliness.

    1. And the key issue here is that because the income groups on either side of this arbitrary line are so different in each state, all of those comparisons we’d like to be able to make – and comparisons that pundits do make, often quite loudly – aren’t very meaningful. From a statistics perspective, taking something like an income distribution, which contains a lot of interesting information, and reducing it to a simple dichotomy of Poor or Not Poor removes most of the useful information that may have been there.
