I had a Twitter argument the other day about a blog post that compared the current debate around “de-selection” of bad teachers to eugenics. It is perhaps a bit harsh to compare Hanushek (cited author of papers on de-selecting bad teachers) to Hitler, if that was indeed the intent. However, I did not take that as the intent of the post by Cedar Riener. Offensive or not, I felt that the post made three key points about errors of reasoning that apply both to eugenicists and to those promoting empirical de-selection of fixed shares of the teacher workforce. Here’s a quick summary of those three points:
- The first error is a deterministic view of a complex and uncertain process.
- The second common error becomes apparent once the need arises to concretely measure quality.
- The third error is a belief that important traits are fixed rather than changeable.
These are critically important, and help us to delineate between smart selection and, well, dumb selection. These three errors of reasoning are the basis for dumb selection. A selection that is, as the author explains, destined to fail. But, I do not see this particular condemnation of dumb selection to be a condemnation of selection more generally. By contrast, the reformy pundit with whom I was arguing continued to claim that Riener’s blog was condemning any and all forms of selection as doomed to fail, a seemingly absurd proposition (and not how I read it at all).
Clearly, selection can and should play a positive role in the formation of the teacher workforce or in the formation of that team of school personnel that can make a school great.
Smart Selection: In nearly every human endeavor, and in any workforce or labor market activity, some form of selection exists: selection of individuals into specific careers, jobs and roles, and de-selection of individuals out of them. Selection in and of itself is clearly not a bad thing. In fact, the best organizations necessarily select the best available individuals over time to work within them. And individuals attempt to select the best organizations, careers, jobs and roles to suit their interests, motivation and needs. That is self-selection. Teacher selection, or the selection of any education system employee, is no different. And good teacher selection is obviously important for having good schools.
Like any selection process on the labor market, teacher selection involves a two-sided match. On the one hand, there are the school leaders and existing employees (to the extent they play a role in recruitment and selection) who may help determine which applicants in a pool are the best fit for their school and the specific job in question. On the other hand, there are the signals sent out by the school (some within and some outside the control of existing staff and leaders), which influence the composition of the applicant pool and, for that matter, whether an individual who is selected decides to stay. These include signals about compensation, job characteristics and work environment. Managing this complex system well is key to having a great school. Sending the right signals. Creating the right environment. Making the right choices among applicants. Knowing when a choice was wrong. And handling difficult decisions with integrity.
There has also been much discussion of late about a recent publication by Brian Jacob of the University of Michigan, who found that when given the opportunity to play a strong role in selecting which probationary teachers should continue in their schools, principals generally selected teachers who later proved to generate good statistical outcomes (test scores). Note that this approach to declaring successful decision making suffers the circular logic I’ve frequently bemoaned on this blog. But, at the very least, Jacob’s findings suggest that decisions made by individuals – human beings considering multiple factors – are not counterproductive when measured against our current batch of narrow and noisy metrics. Specifically, Jacob found:
Principals are more likely to dismiss teachers who are frequently absent and who have previously received poor evaluations. They dismiss elementary school teachers who are less effective in raising student achievement. Principals are also less likely to dismiss teachers who attended competitive undergraduate colleges. It is interesting to note that dismissed teachers who were subsequently hired by a different school are much more likely than other first-year teachers in their new school to be dismissed again.
That to me seems like good selection. And it seems that principals are doing it reasonably well when given the chance. And this is why I also support using principals as the key leverage point in the process (with the caveat that principal quality itself is very unequally distributed, and must be improved).
Dumb “Selection:” Dumb selection, on the other hand, the kind of selection that is destined to fail if applied en masse in public schooling or any endeavor, suffers the three major flaws of reasoning addressed by Cedar Riener in his blog post. Now, you say to yourself, but who is really promoting dumb selection, and what more specifically are the elements of dumb selection when it comes to the teacher workforce? Here are the elements:
- Heavy weight (especially a predefined, fixed, large share) placed on value-added metrics in teacher evaluation, compensation or dismissal decisions, even though those metrics can be corrupted, may suffer severe statistical bias, and are highly noisy and error prone.
- Explicit, prior specification of the exact share of teachers who should be de-selected in any given year, or year after year over time OR prior specification of exact scores or ratings (categories) derived from those scores requiring action to be taken – including de-selection.
Sadly, several states have already adopted into policy the first of these dumb selection concepts: the mandate of a fixed weight to be placed on problematic measures. See this post by Matt Di Carlo at ShankerBlog for more on this topic.
Thus far, I do not know of states or districts that have, for example, required that the bottom-scoring 5% of teachers in any given year be de-selected. But states and districts have established categorical rating systems that sort teachers into high-to-low rating groups, based on arbitrary cut points applied to these noisy measures, and have required that dismissal, intervention and compensation decisions be based on where teachers fall in the fixed, arbitrary classification scheme in a given year, or sequence of three years.
To some extent, the notion of de-selecting fixed shares of the teacher workforce based on noisy metrics comes from economists’ simulations, built more on the convenience of available measures than on active policy conversations. But in the past year, the lines between these simulations and reality have become blurred, as policy conversations have indeed drifted toward actually using fixed values based on noisy achievement measures in place of seniority as a blunt tool to de-select teachers during times of budget cuts. If and when these simplified social science thought exercises are applied as public policy involving teachers, they do reek of the disturbingly technocratic, “value-neutral” mindset pervasive in eugenics as well.
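To make the worry about fixed shares and noisy metrics concrete, here is a rough sketch of my own, not a reproduction of any published simulation. It assumes each teacher has a "true" effect, that the observed score is that effect plus independent random noise, and that the bottom 5% by observed score are de-selected. The function name, the 5% share and the noise levels are all hypothetical choices for illustration.

```python
import random


def simulate_deselection(n_teachers=10_000, share=0.05, noise_sd=1.0, seed=0):
    """Fraction of de-selected teachers who truly belong in the bottom `share`."""
    rng = random.Random(seed)
    # Assumed "true" effectiveness, standard normal across teachers.
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    # Assumed observed score: true effect plus independent measurement noise.
    observed = [t + rng.gauss(0.0, noise_sd) for t in true_effect]

    n_cut = int(n_teachers * share)
    ids = range(n_teachers)
    # Teachers cut under the fixed-share rule, ranked on the noisy score.
    cut_by_score = sorted(ids, key=lambda i: observed[i])[:n_cut]
    # Teachers who would be cut if true effectiveness were observable.
    truly_lowest = set(sorted(ids, key=lambda i: true_effect[i])[:n_cut])

    correctly_cut = sum(1 for i in cut_by_score if i in truly_lowest)
    return correctly_cut / n_cut


if __name__ == "__main__":
    for noise_sd in (0.0, 0.5, 1.0, 2.0):
        hit_rate = simulate_deselection(noise_sd=noise_sd)
        print(f"noise_sd={noise_sd}: share of cuts that were 'right' = {hit_rate:.2f}")
```

With no noise the rule cuts exactly the right people. As the noise grows relative to the real differences between teachers, more and more of the de-selected group are teachers who were never in the true bottom 5%. How noisy real value-added scores are is an empirical question this toy model does not answer.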
One other recent paper that’s gotten attention applies this technocratic (my preference over eugenic) approach to determine whether using performance measures instead of seniority would result a) in different patterns of layoffs and b) in different average “effectiveness” scores (again, that circular logic rears its ugly head). Now, of course, if you lay off based on effectiveness scores rather than seniority, the average effectiveness scores of those left should be higher. The deck is stacked in this reformy analysis. But, even then, the authors find very small differences, largely because a) seniority-based layoffs seem to be affecting mainly first and second year teachers, and b) effectiveness scores tend to be lower for first and second year teachers. Overall, the authors find:
We next examine our value-added measures of teacher effectiveness and find that teachers who received layoff notices were about 5 percent of a standard deviation less effective on average than the average teacher who did not receive a notice. This result is not surprising given that teachers who received layoff notices included many first and second-year teachers, and numerous studies show that, on average, effectiveness improves substantially over a teacher’s first few years of teaching.
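The stacked deck is easy to see in a toy example. The sketch below is mine, not the paper's method, and it makes a deliberately extreme assumption: the "effectiveness" scores are pure random noise, unrelated to anything, including seniority (which the paper itself says is not the case in real data). All names and numbers are hypothetical.

```python
import random
from statistics import mean


def layoff_comparison(n_teachers=5_000, layoff_share=0.05, seed=1):
    """Average score of remaining teachers under two layoff rules."""
    rng = random.Random(seed)
    # Scores are pure noise by construction: they carry no information at all.
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    # Hypothetical years of experience, drawn independently of the scores.
    seniority = [rng.randint(1, 30) for _ in range(n_teachers)]

    n_cut = int(n_teachers * layoff_share)
    ids = range(n_teachers)
    # Rule 1: lay off the lowest-scoring teachers.
    kept_by_score = sorted(ids, key=lambda i: scores[i])[n_cut:]
    # Rule 2: lay off the least senior teachers.
    kept_by_seniority = sorted(ids, key=lambda i: seniority[i])[n_cut:]

    return {
        "all": mean(scores),
        "after_score_layoffs": mean([scores[i] for i in kept_by_score]),
        "after_seniority_layoffs": mean([scores[i] for i in kept_by_seniority]),
    }


if __name__ == "__main__":
    for rule, avg in layoff_comparison().items():
        print(f"{rule}: {avg:.3f}")
```

Even with meaningless scores, score-based layoffs "win" when the yardstick is the average of those same scores, because dropping the lowest values always raises an average. That is the circularity: the comparison tells us about arithmetic, not necessarily about teaching.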
Perhaps most importantly, these thought experiments, not ready for policy implementation prime time (nor will they ever be?), necessarily ignore the full complexity of the system to which they are applied and, as Riener noted, assume that individuals’ traits are fixed: how you are rated by the statistical model today is assumed to be correct (despite a huge chance it’s not) and assumed to be sufficient for classifying your usefulness as an employee, now and forever (be it a 1 or 3 year snapshot). In that sense, Riener’s comparison, while offensive to some, was right on target.
To summarize: Smart selection good. Dumb selection bad. Most importantly, selection itself is neither good nor bad. It all depends on how it’s done.