Andy Rotherham over at Eduwonk posted an Irony Alert yesterday as many media outlets poised themselves to start “outing” ineffective teachers by publicly posting those teachers’ value-added effectiveness scores. Rotherham argued:
In light of this blow up about value-added in New York City, in a lot of places if the teachers unions would actually get serious about actually using value-add data as part of teacher evaluations it could be shielded from “Freedom of Information” requests that identify teachers, just as many aspects of personnel evaluations are. They’re caught in their own mousetrap here. My take on the larger issue from a few weeks ago and LA.
I thought…. hmmm… really? That doesn’t seem right. Is this just a clever argument intended to dupe teachers into getting those scores into their evaluations on some false assumption that the information would then be protected? Are these issues even transferable from state to state? Is the raw data used for generating the teacher effectiveness ratings actually considered part of the personnel file? I’m somewhat of an amateur on this school law stuff, but have enough background to start asking these questions when such arguments are tossed out there. So I did. I asked a handful of legal scholars in education policy, each of whom deals regularly with legal questions over personnel records under state law and with student record information.
This is good stuff, and the very kind of conversation we should be having when such questions are raised. Ask the experts. Much of the argument hinges on when the raw data is translated into a measure that actually becomes part of the personnel file (at least with regard to the “shield” issue posed by Rotherham). Here’s Justin Bathon’s summary:
Anyway, summarizing, I think the raw data is generally going to be made publicly open following FOIA requests. I think New York City is currently correct in their assessment that no exemption exists under New York’s Freedom of Information Law. However, this is just my analysis after considering this issue for a single day and I want to caution against over reliance on my initial assumptions. A thorough analysis needs to be conducted of all 50 state policies, interpreting regulations, attorney general opinions, and previous case-law. Further, data experts such as Bruce must assist the analysis with a complete understanding of each state’s dataset and the possible links to both teachers and their evaluations within the datasets. Thus, there is still a lot of work left to be done.
This is a legal frontier (another one of those enabled by technology) that most legislatures would not have contemplated as possible in enacting their open records laws. Thus, it is a great topic for us to debate further to inform future policy actions on open records personnel evaluation exemptions.
Please read the rest of his well-thought-out, albeit preliminary, post.
Here are my follow-up comments (cross-posted at edjurist) on data/data structures and their link to teacher evaluations:
Here are some data scenarios:
A. The district has individual student test score data that are linkable to individual teachers, but the district doesn’t use those data to generate any estimates of individual teacher “effectiveness,” has not adopted any statistical method for doing so, and therefore does not include any such estimates as part of personnel records. Individual students’ identities can be masked, but IDs are matched over time with specific characteristics attached (race, low-income status).
B. The district has individual student test score data that are linkable to individual teachers just as above, and the district has an adopted statistical model/method for generating teacher value-added “effectiveness” scores but uses those estimates only for district-level evaluation/analysis, not for individual teacher evaluation.
C. The district has individual student test score data that are linkable to individual teachers as above, the district has an adopted statistical method/model for generating teacher value-added “effectiveness” scores, and it has negotiated a contractual agreement with teachers (or is operating under a state policy framework) that requires inclusion of the “effectiveness” scores in the formal evaluation of the teacher.
Under option C above, sufficient technical documentation should be available so that “effectiveness” estimates could be checked, replicated, and audited by an outside source. That is, while there should be materials that provide sufficiently understandable explanations so that teachers can understand their own evaluations and the extent to which their “effectiveness” ratings are, or are not, under their own control, there should also be a detailed explanation of the exact variables used in the model, the scaling of those variables, etc., and the specification of the regression equation that is used to estimate teacher effects. There should be sufficient detail to replicate district-generated teacher effectiveness scores.
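To make concrete what such a “specification” entails, here is a minimal, hypothetical sketch of a value-added regression with teacher fixed effects. Every variable name, number, and the simulated data are my own assumptions for illustration; no district’s actual model is this simple.

```python
# Hypothetical sketch: current score regressed on prior score,
# a poverty indicator, and teacher dummy (fixed-effect) variables.
# All data here are simulated; this is an illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_teachers, students_per = 5, 40
n = n_teachers * students_per

teacher = np.repeat(np.arange(n_teachers), students_per)
prior = rng.normal(500, 50, n)              # prior-year test score
poverty = rng.integers(0, 2, n)             # low-income indicator
true_effect = rng.normal(0, 5, n_teachers)  # simulated "true" teacher effects
score = (50 + 0.9 * prior - 10 * poverty
         + true_effect[teacher] + rng.normal(0, 20, n))  # current score + noise

# Design matrix: intercept, prior score, poverty, teacher dummies
# (first teacher omitted as the reference category).
dummies = (teacher[:, None] == np.arange(1, n_teachers)).astype(float)
X = np.column_stack([np.ones(n), prior, poverty, dummies])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

# beta[3:] are the estimated teacher effects, relative to teacher 0.
print(np.round(beta[3:], 2))
```

The point of the documentation requirement is that everything above — which covariates enter, how they are scaled, which category is the reference — would be spelled out, so an outside analyst could reproduce the district’s numbers exactly.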
That aside, a few different scenarios arise.
1. The LA Times scenario, as I understand it, falls under the first set of conditions above. The data existed in raw form. The district was not using those data for “effectiveness” ratings. The LAT got the data and handed them over to Richard Buddin of RAND. Buddin then estimated the most reasonable regression equation he could with the available data and, for that matter, produced a sufficiently detailed technical report – such that anyone accessing the same data could replicate his findings. I suspect that individual student names were masked, but the students were clearly matched to identifiable teachers, and student data included specific identifiers of race, poverty, etc., and participation in programs such as gifted programs (an indicator that a child is labeled as gifted). I’m not sure what issues, if any, are raised by detailed descriptive information on child-level data. In this case, the data requested by the LAT and handed over to Buddin were not linked to teacher evaluation by the district itself in any way, as I understand it.
2. As I understood the recent NYC media flap, the city itself was looking to report/release the value-added ratings, and the city itself also intends to use those “value-added” ratings for personnel evaluation. It sounded to me as if Charleston, SC, was proposing roughly the same. Each teacher would have a nifty little report card showing his or her “relative” effectiveness rating compared to other teachers. This effectiveness rating is essentially a “category” labeling a teacher as “better than average” or “worse than average.” These categories are derived from more specific “estimates,” which come from a statistical model that generates a coefficient for each teacher’s “effect” on the students who have passed through that teacher’s classroom (these coefficients having substantial uncertainty and embedded bias, which I have discussed previously… but that’s not the point here). So, the effectiveness profile of the teacher is an aggregation of these “effects” into larger categories – but it is nonetheless directly drawn from the effect estimates generated by the district itself for teacher evaluation purposes (even if subcontracted by the district to a statistician). I would expect that the specific estimate and the profile aggregation would be part of the teacher’s personnel record.
So, now that the city’s official release of effectiveness profiles is on hold, what if a local newspaper requested a) the raw student data linkable to teachers, with student names masked but with sufficient demographic detail on each student and with identifiable information on teachers, and b) the detailed technical documentation on the statistical model and specific variables used in that model? The newspaper could then contract a competent statistician to generate his/her own estimates of teacher effectiveness using the same data used by the district and the same method. These would not be “official” effectiveness estimates, nor could the media outlet claim them to be. But they would be a best attempt at replication. Heck, it might be more fun if they used a slightly different model, because the ratings might end up substantially different from the district’s own estimates. Whether or not they replicated the district’s own methods, and whether they produced roughly the same or very different ratings for teachers, these estimates would still not be the official ones. Given the noise and variation in such estimates at the teacher level, it might actually be pretty hard to get estimates that correlate substantially with the district’s own – and one would never know, because the district’s official effectiveness estimates for teachers would still be private.
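The claim that noisy estimates may correlate only weakly across replications can be illustrated with a toy simulation. All numbers here are assumptions chosen for illustration: when the estimation noise is large relative to the real spread of teacher effects, two honest, independent runs agree far less than one might expect.

```python
# Hypothetical simulation: two independent noisy estimates of the
# same "true" teacher effects. With noise sd = 2 and true-effect
# sd = 1, the expected correlation between the two runs is only
# var(true) / (var(true) + noise_sd**2) = 1/5 = 0.2.
import numpy as np

rng = np.random.default_rng(42)
n_teachers = 200
true = rng.normal(0, 1, n_teachers)          # "true" effects (sd = 1)
noise_sd = 2.0                               # estimation noise (sd = 2)

est_a = true + rng.normal(0, noise_sd, n_teachers)  # "district" run
est_b = true + rng.normal(0, noise_sd, n_teachers)  # "newspaper" run

r = np.corrcoef(est_a, est_b)[0, 1]
print(round(r, 2))
```

Under these invented parameters, the two rankings share only a modest fraction of their variation, which is exactly why an outside replication could diverge wildly from the district’s private numbers without anyone being able to tell.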
I would assume under these circumstances – partly because the “official” personnel-file estimates would remain unknown, and partly because the independent estimates produced by the media outlet, even if intended as a replication, might vary wildly from the district’s – that the media outlet could get the data, estimate the model, and report their results – their unofficial results. On the one hand, the media outlet could rely on the uncertainty of the estimates to argue that what they produce should not be considered “official” estimates. And on the other hand… in bold print in the paper… they could argue, as the LA Times Jasons have, that these estimates are good and reliable estimates of actual teacher effectiveness!
The conversation continues over at EdJurist: http://www.edjurist.com/blog/value-added-evaluation-data-and-foia-state-versions-that-is.html?lastPage=true#comment10260419
Interesting follow-up point from Scott Bauries over at EdJurist:
Thus, from the legal perspective, I am left with one question: if the data and conclusions are being used as reflected in option “c,” but the media only gets the conclusions and not the raw data, then does the law allow a teacher to protect his or her reputation from unfair damage due to the publishing of a conclusion based on a noisy equation?
This is a very complicated question, involving both defamation law and the First Amendment. For example, is a public school teacher a “public official” or “public figure” for First Amendment purposes, such that the standard for proving defamation per se is increased? If so, then is the relevant statistical analysis illustrating the noisy nature of the conclusion enough to show falsehood for the purposes of a defamation claim? I think probably not in both instances, but I don’t think this precise issue has ever come up.