Teacher Evaluation with Value Added Measures

Posted on November 7, 2009

This month, the journal Education Finance and Policy published its special issue on value-added measurement of student outcomes. The table of contents is here:

http://www.mitpressjournals.org/toc/edfp/4/4

This is good stuff, written by leading researchers in educational measurement and statistics along with economists. These articles provide some important cautionary tales regarding the application of value-added measures of student outcomes to teacher evaluation. Here is a policy brief with a more user-friendly summary of some of the content of the special issue:

http://www.wcer.wisc.edu/publications/highlights/v19n3.pdf

Here’s a recent working paper by Jesse Rothstein, a Princeton economist who also has an article in the special issue:

http://gsppi.berkeley.edu/faculty/jrothstein/published/rothstein_vam2.pdf

Here’s the concluding sentence of the abstract of Rothstein’s paper:

“Results indicate that even the best feasible value added models may be substantially biased, with the magnitude of the bias depending on the amount of information available for use in classroom assignments.”
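To see mechanically what Rothstein is worried about, here’s a toy simulation of my own (a rough sketch, not Rothstein’s actual model or data, with entirely hypothetical parameter values): when students are tracked into classrooms on the basis of prior test scores, a naive gain-score “value added” estimate absorbs mean reversion in those prior scores and recovers true teacher effects less well than it does under random assignment.

```python
# Toy illustration (my own sketch, not Rothstein's model): naive gain-score
# "value added" under random vs. tracked classroom assignment.
import numpy as np

rng = np.random.default_rng(0)
n_teachers, class_size = 100, 25
n_students = n_teachers * class_size

teacher_effect = rng.normal(0.0, 0.2, n_teachers)   # true effects we want to recover
ability = rng.normal(0.0, 1.0, n_students)          # stable student ability
prior = ability + rng.normal(0.0, 0.5, n_students)  # noisy prior-year test score

def naive_va(order):
    """Fill classrooms in the given student order; return mean gain per teacher."""
    teacher_of = np.empty(n_students, dtype=int)
    teacher_of[order] = np.repeat(np.arange(n_teachers), class_size)
    score = ability + teacher_effect[teacher_of] + rng.normal(0.0, 0.5, n_students)
    gain = score - prior  # naive value added: average test-score gain
    return np.array([gain[teacher_of == t].mean() for t in range(n_teachers)])

va_random = naive_va(rng.permutation(n_students))  # classrooms filled at random
va_tracked = naive_va(np.argsort(prior))           # classrooms sorted by prior score

print("corr(naive VA, true effect), random :",
      round(float(np.corrcoef(va_random, teacher_effect)[0, 1]), 2))
print("corr(naive VA, true effect), tracked:",
      round(float(np.corrcoef(va_tracked, teacher_effect)[0, 1]), 2))
```

Under tracking, the students assigned to a “low” classroom tend to have negative noise in their prior scores, so their gains mean-revert upward and get credited to the teacher. That composition effect, not anything the teacher did, is the kind of bias Rothstein is describing, and it grows with the amount of prior-score information used in making assignments.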

On balance, the articles in the special issue do show some promise for using value-added assessment in teacher evaluation, subject to a number of really important caveats and technical stipulations.

Yes, we need access to more student assessment data with linkages to specific teachers – including the full range of teachers with whom middle and secondary students interact (it’s not as simple as linking a single teacher to a group of children). We need such data across multiple states and their assessment systems. The scaling properties of the tests and the noise in test scores play a major role in the precision with which one can isolate teacher- or classroom-level effects. We have little or no idea, for example, whether findings from analyses of North Carolina or Texas assessment data carry over to New Jersey assessment data, what the statistical properties of the New Jersey data are, or how useful those data would be for estimating teacher or classroom effects (unless there are technical papers out there on NJ tests of which I am unaware).
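On the test-noise point, here is a back-of-the-envelope sketch (my own illustration with hypothetical variance components, not figures from the special issue): treat a classroom’s mean score gain as the estimate of the teacher effect and ask what share of its variance is teacher signal rather than sampling noise.

```python
# Back-of-the-envelope reliability of a class-mean gain as a teacher-effect
# estimate. The SDs are hypothetical, expressed in student-level SD units.

def reliability(class_size, sd_teacher=0.2, sd_noise=0.7):
    """Share of class-mean variance attributable to the teacher (signal)."""
    signal = sd_teacher ** 2
    sampling_error = sd_noise ** 2 / class_size  # averages out as classes grow
    return signal / (signal + sampling_error)

for n in (10, 25, 50, 100):
    print(f"class of {n:>3}: reliability = {reliability(n):.2f}")
```

With these made-up numbers, a class of 10 yields a reliability around 0.45 and a class of 25 around 0.67 – noisier tests push those figures down fast, which is exactly why the statistical properties of each state’s particular assessments matter so much.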

So, these are the main reasons we need to tear down the firewalls that keep student data from being linked to teachers – to advance the art, science, and statistics of value-added modeling and of school and teacher evaluation, and to uncover potential shortcomings where they exist.

Policymakers and pundits diving in head first on these issues quite simply need to chill out: read the special issue above, heed the advice offered earlier this year by the National Academy of Sciences, and figure out how to do this right if we’re going to do it at all.

Diving in too quickly and doing it wrong will make it that much harder to do it right in the long run and will provide that much more ammunition for resistance.