There’s much talk in education research about Randomized Controlled Trials and truly “experimental” research being the “gold standard” for determining whether a specific intervention “works” or not. This is the basis for the Institute of Education Sciences’ What Works Clearinghouse. It is often argued that randomized, or experimental, studies are “good” and decisive, and that other approaches simply don’t match up. Therefore, if someone really wants to know what works or doesn’t with regard to a specific intervention or set of interventions, they need only review those randomized, experimental studies to identify the consensus finding.
There’s so much to discuss on these issues, including the extent to which truly randomized experiments can actually shed light on how interventions might play out in other settings or at scale. But I’ll stick to a much narrower focus in this post, and that is, just how randomized is randomized? Most recently, this question came to mind after reading this post addressing “experimental” vs. “non-experimental” studies of charter schools by Matt Di Carlo at Shanker Blog, and this post over at Jay P. Greene’s blog on RIGOROUS charter research (meaning experimental, or randomized).
There tend to be two types of studies done to determine the relative effectiveness of “charter schools” versus traditional “district schools.” The basic idea of either type of study is to determine the effect that “charter schooling,” or some specific set of policies/practices and instructional models and strategies associated with “charter schooling,” has on students’ outcomes, when compared to kids who don’t receive those strategies. That is, exposure to “charter schooling” is assumed to be a treatment, and non-exposure, whatever that constitutes, is the control.
One type of study tries to identify after the fact, otherwise similar kids (matched pairs) attending a set of charter schools and a set of district schools in the same city, and then compares their achievement growth over time. These studies often fall short in two important ways.
- First, the measures used to determine which kids are comparable, or paired, are often too crude to ensure that kids really are similar (using broad classifications of income status, or disability status), creating substantial risk that they are not.
- Second, while kids in charter and district schools are matched with one another on these crude characteristics, the settings in which they are schooled, including the mix of peers with which they attend school, may be dramatically different.
The other type of study is often referred to as meeting the gold standard, as being a randomized or lottery-based study. It is assumed, since these studies are declared golden, that they necessarily resolve both of the above concerns. And it is possible that if these studies truly were randomized (or even could be), they could resolve those concerns. But they don’t (resolve these concerns), because they aren’t (really randomized).
First, what would a randomized study look like? Well, it would have to look something like this: we randomly take a group of kids, with consent or even against their will, and assign them to either the charter or traditional school option. The mix of kids in each group is truly random and checked to ensure that the two groups are statistically representative (using better than the usual measures) of the population. Then, we have to make sure that all other “non-treatment” factors are equivalent, including access to facilities, resources, etc. That is, anything that we don’t consider to be a feature of the treatment itself. This is especially important if we want to know whether expanding elements of the treatment is likely to work for a representative population. This is a randomized, controlled trial.
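So then, what’s randomized in a randomized charter school study? Or lottery-based study? One might sketch out a lottery-based study as a sequence: families choose whether to enter the lottery, the lottery selects winners from that pool, and winners and losers then end up in different schools with different peers.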
Here, the study is really only randomized at one point in a long, complicated sequence: the lottery itself. Students and families have to decide they want to enter the lottery, that they are interested in attending a charter school, and that decision ultimately affects the composition of charter school enrollments. Then, among those selecting into the pool, some students are randomly chosen to attend the charters alongside others randomly chosen to attend (from a non-random pool of lottery participants), and the rest are randomly selected to go, well, somewhere else… with a group of peers non-randomly chosen to end up in that same somewhere else.
So, while the studies compare the achievement of kids randomly chosen to those randomly un-chosen (thus comparing only those who tried to get a charter slot), the kids are shuffled into settings that are anything but randomly assigned, containing potentially vastly different peer groups and a variety of other differences in setting. Add to this the likelihood of non-random student attrition, further altering the peer group over time.
As such, I very much prefer these studies to be referred to as “lottery-based” rather than randomized or experimental. These studies are randomized at only one step in this process, potentially conflating setting/peer effects with treatment effects, thus substantially compromising policy implications.
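To make the conflation concrete, here is a toy simulation of the sequence described above. It is only a sketch under assumptions I made up for illustration: a single “motivation” trait, application rates of 60% and 20%, a “true” charter effect of 0.10, and a peer effect coefficient of 0.30. None of these numbers come from any actual study, and the function name `simulate_lottery_study` is hypothetical.

```python
import random
import statistics


def simulate_lottery_study(n_students=40000, true_effect=0.10, peer_coef=0.30,
                           equalize_settings=False, seed=0):
    """Toy lottery study. Every number here is a made-up assumption."""
    rng = random.Random(seed)

    # Step 1 (NOT random): families decide whether to enter the lottery.
    # Assumption: students with above-average "motivation" apply more often.
    applicants, non_applicants = [], []
    for _ in range(n_students):
        motivation = rng.gauss(0, 1)
        p_apply = 0.6 if motivation > 0 else 0.2
        if rng.random() < p_apply:
            applicants.append(motivation)
        else:
            non_applicants.append(motivation)

    # Step 2 (random): the lottery splits applicants into winners and losers.
    rng.shuffle(applicants)
    half = len(applicants) // 2
    winners, losers = applicants[:half], applicants[half:]

    # Step 3 (NOT random): winners sit with other winners; losers sit with
    # other losers plus everyone who never applied.
    charter_peers = statistics.mean(winners)
    district_peers = statistics.mean(losers + non_applicants)
    if equalize_settings:
        # Idealized controlled trial: both groups get the same peer setting.
        charter_peers = district_peers

    def outcome(motivation, in_charter, peer_mean):
        noise = rng.gauss(0, 1)
        return motivation + true_effect * in_charter + peer_coef * peer_mean + noise

    y_win = [outcome(m, 1, charter_peers) for m in winners]
    y_lose = [outcome(m, 0, district_peers) for m in losers]

    return {
        "true_effect": true_effect,
        "lottery_estimate": statistics.mean(y_win) - statistics.mean(y_lose),
        "peer_gap": charter_peers - district_peers,
        "winner_loser_balance": statistics.mean(winners) - statistics.mean(losers),
    }


if __name__ == "__main__":
    for label, equalize in [("lottery-based", False), ("settings equalized", True)]:
        result = simulate_lottery_study(equalize_settings=equalize)
        print(label, {k: round(v, 3) for k, v in result.items()})
```

Under these invented parameters, winners and losers look balanced on motivation (the lottery did its job), yet the winner-minus-loser difference should come out above `true_effect`, because it also picks up `peer_coef` times the gap in peer composition. Setting `equalize_settings=True` removes that gap, which is the “controlled” part that a lottery alone does not deliver.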
As with those matching studies, the types of variables used to check and/or correct for peer composition and non-randomness of attrition are often too imprecise to be useful.
One fun alternative would be to pull a switch, whereby the charter teachers, their model, instructional strategies etc. would be traded with the district schools’ teachers, model and strategies, as a confirmatory test to see whether the charter model effects are actually transferable (assuming there were effects to begin with).
Clearly, I’m asking way too much to expect that charter school research, or most other program/intervention research in education, be based on real RCTs. That’s not going to happen. And I’m not convinced it would be that useful for informing policy anyway. But my point in this post is to make it clear that the difference between the types of matched student studies done by CREDO, for example, and the studies being (mis)characterized as “gold standard” randomized studies is far more subtle than many are willing to admit, and NEITHER IS WHAT IT’S REALLY CRACKED UP TO BE!