Negotiating Points for Teachers on Value-Added Evaluations

A short time back I posted an explanation of how using value-added student testing data could lead to a series of legal problems for school districts and states.  That post can be found here:

My concerns regarding legal issues arose from statistical problems and some practical problems associated with using value-added assessment to reliably and validly measure teacher effectiveness. The main issue is to protect against wrongly firing teachers on the basis of statistical noise, or on the basis of factors that influenced the value-added scores that were not related to teacher effectiveness.

Among other things, I pointed out problems associated with the non-random assignment of students, and how non-random assignment of students across teachers' classrooms can significantly influence – that is, bias – value-added estimates of teacher effectiveness. Non-random assignment could, under certain state policies or district contracts, lead to the “de-tenuring” and/or dismissal of a teacher simply on the basis of the students assigned to that teacher. Links to research and a more detailed explanation of the non-random assignment problem are provided in the previous post above.

Of course, this also means that school principals or superintendents – anyone with sufficient authority to influence teacher and student assignment – could intentionally stack classes against the interests of specific teachers. A principal could assign students to a teacher with the intent of harming that teacher’s value-added estimates.

To protect against this possibility, I suggest that teachers' unions or individual teachers argue for language in their contracts which requires that students be randomly assigned and that class sizes be precisely the same – along with the time of day when courses are taught, lighting, room temperature, nutrition and any other factors that could compromise a teacher’s value-added score and could be manipulated against a teacher.

The language in the class size/random assignment clause will have to be pretty precise to guarantee that each teacher is treated fairly – in a purely statistical sense. Teachers should negotiate for a system that guarantees “comparable class size across teachers – not to deviate more than X” and requires that year-to-year student assignment to classes be managed through a “stratified randomized lottery system with independent auditors to oversee that system.” The lottery would be stratified by disability classification, poverty status, language proficiency, neighborhood context, number of books in each child’s home, and so on. That is, each class must be equally balanced with a randomly (lottery) selected set of children within each relevant classification. This gets out of hand really fast.
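
For anyone curious what such a clause would actually ask a district to do, here is a minimal sketch of a stratified lottery in Python. Everything in it is hypothetical and for illustration only: the function name `stratified_lottery`, the student fields `iep` and `ell`, and the seed are my own assumptions, not any district's actual procedure. The idea is to group students by every combination of the chosen classifications, shuffle within each group, and deal students out to classes in turn so that both class sizes and each group's representation stay within one student of each other.

```python
import random
from collections import defaultdict


def stratified_lottery(students, strata_keys, n_classes, seed=None):
    """Randomly assign students to classes, balanced within each stratum.

    students    -- list of dicts, one per student (hypothetical fields)
    strata_keys -- names of the fields to stratify on
    n_classes   -- number of classes to fill
    seed        -- a published seed would let an auditor re-run the draw
    """
    rng = random.Random(seed)

    # Group students by their combination of classifications (one "cell" each).
    strata = defaultdict(list)
    for student in students:
        strata[tuple(student[k] for k in strata_keys)].append(student)

    classes = [[] for _ in range(n_classes)]
    next_class = 0
    for cell in sorted(strata):
        group = strata[cell]
        rng.shuffle(group)  # the lottery: random order within the cell
        for student in group:
            # Deal round-robin, continuing across cells so that overall
            # class sizes never differ by more than one student.
            classes[next_class % n_classes].append(student)
            next_class += 1
    return classes


# Made-up roster: 60 students with two yes/no classifications.
students = [
    {"name": f"S{i}", "iep": i % 4 == 0, "ell": i % 3 == 0}
    for i in range(60)
]
classes = stratified_lottery(students, ["iep", "ell"], n_classes=3, seed=2010)
print([len(c) for c in classes])
```

Note how quickly the cells multiply. Two yes/no classifications give 4 cells; five give 2^5 = 32, which is already more cells than there are seats in most classrooms, so many cells would hold one student or none and there would be nothing left to balance.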


I welcome suggestions for other clauses that should be included.

Just pondering the possibilities.
This is a thoughtful read from a general supporter of using VA assessments to create better incentives to improve teacher quality. Read the “Policy Uses” section on pages 3-4.



  1. Hello Bruce –
    I have great respect for your work and your blog, and appreciate the work that you do to bring more daylight to these areas of education. However, I have to disagree with your approach here. I’m not sure the limitations you mentioned in the prior blog are surmountable. You can try to guarantee random assignment and control for all of those factors, but at small schools, there will be too many factors and not enough students to even everything out. Then, at larger secondary schools, you would have to go a step further and build entire school schedules around that randomization – which won’t work well for students who take classes that can’t be distributed randomly all around the rest of the day. Secondary teachers can tell you that smaller classes determine a kid’s schedule. I can teach a basic ninth grade English class any period of the day, but the band kids are going to be clustered because band classes aren’t offered all day, and the remedial math kids (in some schools) are going to be clustered in certain English classes because their remedial math class is only offered at certain times of day. In a diverse public school, want to take a guess how the socio-economic and ethnic groupings are usually distributed among band and remedial math? So much for randomizing.

    Furthermore, I have argued at length in my blog and in an article in Teacher Magazine that you cannot tie reading performance to any one teacher. Yes, the reading skills are described in my subject standards as an English teacher. My students also read text in every other class during the day, and at home, and some have tutors. How do you control for all of those influences? AND furthermore, the tests that I’m familiar with have too much garbage in them to make it worthwhile to negotiate what you suggest anyway.

    So, I’d say you were more on track with the prior blog. The legal challenges will not be sufficiently addressed in the ways you suggest, and I am actually surprised to see teachers anywhere negotiating for the inclusion of test scores in evaluation – unless – maybe – it’s as a factor in evaluating a school or department.

    See my blog for more on testing, evaluation, and why even leading thinkers in business wouldn’t go along with such schemes.

    1. You are absolutely right that the issues in my previous post cannot be overcome. And, I agree entirely with your point that you can’t tie reading performance to any one teacher. Actually, the point of this post was to illustrate just how absurd the whole exercise is. These issues can’t be resolved. And even if they could, you’d have to jump through completely absurd hoops to get there. Sadly, sarcasm doesn’t come through well in a blog post.

  2. Doh! Sorry I didn’t pick up on the sarcasm creeping in there. On second glance I get it. The perils of reading a blog on an iPhone and not paying closer attention…

  3. What?! Sarcasm in blogs?! I wonder if Steve Jobs is filtering sarcasm as well as perceived obscenity…

    I understand the temptation to put ridiculous conditions in contracts, but the matter is serious enough that there does need to be some thought as to crafting appropriate contract language, since teachers unions have to bargain all sorts of ridiculous things to make them work as best as possible (or with as little harm as possible). My local district (Hillsborough County, Fla.) has been experimenting using the state’s merit-pay structure in ways that you’d expect, with technical missteps every year in ways that rub a bunch of teachers the wrong way. Thus far, that’s only been at the level of bonus distribution and not people’s jobs (and the children who don’t get taught by the teachers the following year).

    My understanding is that there are really three places to intervene most effectively in contract language: in the procedures used at the front end, in the gatekeeping threshold for when someone’s job is at risk, and in appeals to save someone’s butt. The concept of “rebuttable presumption” is probably the one to use, but my bargaining experience is limited and only at the higher-ed level.

    1. On the one hand, my comments above are certainly sarcastic. But the reality here is that if the train has already left the station, and if performance data of this type will be used for either allocation of bonuses, salary increases or for dismissal, then teachers individually and collectively must bargain for technical provisions like those I note above in order to protect themselves from unfair treatment, intentional or not. My problem here is that the move toward these problematic systems actually does require negotiation of contractual provisions that seem quite absurd. Sadly, the absurd becomes necessary. And even then, these absurd provisions can’t fully resolve all of the technical and practical concerns, like those raised by David Cohen.

  4. This is a very good point.

    As a simple matter of equity for students and teachers, class sizes should be uniform across all schools. In NYC, class sizes in HS range from about 18 to 34 or more across schools, with the small schools and charters having significantly smaller classes.

    As for the teacher performance evaluations done in NYC, they supposedly do take class size into account in their formula, but the model is so obscure and non-transparent that it is difficult to see how. In addition, the reported class size data that they draw from, esp. in HS, is unreliable.

  5. Wow Bruce-
    Thanks for the tongue-in-cheek take on how to make value-added evaluations “fair” through contract negotiations. I am one of those teachers who counselors, special ed. IEP teachers, and the ELL department have realized will work with their students closely and with success. Of course many of these students still cannot pass the state-mandated test, and I am worried that teachers like me who are making a difference and are effective teachers will get dinged by this new system being proposed.

