Mark Dynarski has added some useful recommendations regarding productivity research. Dynarski's comments come in response to our suggestions for improving the rigor of productivity research, suggestions grounded in the relevant methods we would expect to see applied in such research.
We agree with Mark Dynarski that using relevant methods alone doesn't guarantee that they are used well. We were starting from the position that the work of Roza and Hill doesn't apply relevant methods at all, much less apply them well. With that in mind, we concur with Dynarski's argument that it is important not only to use the right methods but to use them well, and that reasonable standards may be applied. Here are Mark Dynarski's suggestions:
Here are some examples of what I had in mind for research standards: the analysis has been replicated by another researcher working independently (replication being a lynchpin of scientific method). Predictions from the analysis have explanatory power outside the sample. The modeling framework is mathematically consistent. The research team has no conflicts of interest.
Applying these standards might result in excluding a lot of current research (even peer-reviewed research), but I think that would be the point Welner and Baker are making.
Readers interested in assessing research might take a look at the National Academy of Sciences' Reference Manual on Scientific Evidence, now in its third edition, especially the chapter by Kaye and Freedman on statistics. It's highly readable and available for free download from the academy's website.
Below is my original reply to Mark Dynarski’s comment:
Over at Sara Mead’s Ed Week blog, Mark Dynarski checks in with a few relevant questions and observations. Actually, as it turns out, we agree ALMOST entirely with Dynarski when he says:
And focusing on peer-reviewed research as a form of quality assurance, as Baker and Welner suggest, seems problematic. Peer-reviewed research journals have highly variable degrees of editorial control, and peer review itself can vary from cursory reading to exhaustive and detailed comments. My own observation is that focusing on research with rigorous designs probably is a superior contributor to quality on average. There never seem to be enough of these when difficult debates on education policy issues arise, though.
Our only disagreement here is with his characterization of what we said. We did not uphold peer review as the gold standard, though we probably used the phrase "peer review" too often in the brief itself. Rather, we believe just as Dynarski stated, that research with rigorous designs is a superior contributor to quality, on average! Hell yes. Absolutely. That's our point. At the very least, the issues and questions at hand should be framed, or frame-able, in relevant terms for rigorous evaluation.
That is precisely our concern with the materials by Roza and Hill, and by Roza and other colleagues, that we address in our report (see pages 9 to 14). Further, a large section of our report summarizes the relevant methods, those rigorous and appropriate designs that should be applied to the questions at hand, but that are noticeably absent even at the most cursory level in Roza and Hill's materials.
To save you all the trouble of actually reading our entire brief, I’ve copied and pasted below the section of our brief where we address relevant methods:
Summary of Available Methods
Discussions of educational productivity can and should be grounded in the research knowledge base. Therefore, prior to discussing the Department of Education’s improving productivity project website and recommended resources, we think it important to explain the different approaches that researchers use to examine productivity and efficiency questions. Two general bodies of research methods have been widely used for addressing questions of improving educational efficiency. One broad area includes “cost effectiveness analysis” and “cost-benefit analysis.” The other includes two efficiency approaches: “production efficiency” and “cost efficiency.” Each of these is explained below.
Cost-Effectiveness Analysis and Cost-Benefit Analysis
In the early 1980s Hank Levin produced the seminal resource on applying cost effectiveness analysis in education (with a second edition in 2001, co-written with Patrick McEwan),[i] helpfully titled “Cost-Effectiveness Analysis: Methods and Applications.” The main value of this resource is as a methodological guide for determining which, among a set of options, are more and less cost effective, which produce greater cost-benefit, or which have greater cost-utility.
The two main types of analyses laid out in Levin and McEwan's book are cost-effectiveness analysis and cost-benefit analysis, the latter of which can focus on either short-term cost savings or longer-term economic benefits. All these approaches require an initial determination of the policy alternatives to be compared. Typically, the baseline alternative is the status quo. The status quo is not necessarily a bad choice. One embarks on cost-effectiveness or cost-benefit analysis to determine whether one might be able to do better than the status quo, but it is not simply a given that anything one might do is better than what is currently being done. It is indeed almost always possible to spend more and get less with new strategies than with maintaining the current course.
Cost-effectiveness analysis compares policy options on the basis of total costs. More specifically, this approach compares the spending required under specific circumstances to fully implement and maintain each option, while also considering the effects of each option on a common set of measures. In short:
Cost of implementation and maintenance of option A
÷ Estimated outcome effect of implementing and maintaining option A

versus

Cost of implementation and maintenance of option B
÷ Estimated outcome effect of implementing and maintaining option B
Multiple options may (and arguably should) be compared, but there must be at least two. Ultimately, the goal is to arrive at a cost-effectiveness index or ratio for each alternative in order to determine which provides the greatest effect for a constant level of spending.
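The comparison described above reduces to a simple computation. The sketch below, using entirely hypothetical cost and effect figures, shows how a cost-effectiveness ratio (dollars per unit of measured effect) would be computed and compared across two options:

```python
# Hypothetical cost-effectiveness comparison of two program options.
# Costs and effect sizes are illustrative assumptions, not figures
# from the brief.
options = {
    "A": {"cost": 1_200_000, "effect": 0.25},  # total cost; effect in SD units
    "B": {"cost": 900_000, "effect": 0.15},
}

for name, o in options.items():
    # Cost-effectiveness ratio: dollars required per unit of effect.
    # The option with the LOWER ratio delivers more effect per dollar.
    ratio = o["cost"] / o["effect"]
    print(f"Option {name}: ${ratio:,.0f} per effect-size unit")
```

Here option A, despite its higher total cost, produces each unit of effect more cheaply, which is exactly the kind of result a raw cost comparison would miss.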
The accuracy of cost-effectiveness analyses is contingent, in part, upon carefully considering all direct and indirect expenditures required for the implementation and maintenance of each option. Imagine, for example, program A, where the school incurs the expenses on all materials and supplies. Parents in program B, in contrast, are expected to incur those expenses. It would be inappropriate to compare the two programs without counting those materials and supplies as expenses for Program B. Yes, it is “cheaper” for the district to implement program A, but the effects of program B are contingent upon the parent expenditure.
Similarly, consider an attempt to examine the cost effectiveness of vouchers set at half the amount allotted to public schools per pupil. Assume, as is generally the case, that the measured outcomes are not significantly different for those students who are given the voucher. Finally, assume that the private school expenditures are the same as those for the comparison public schools, with the difference between the voucher amount and those expenditures being picked up through donations and through supplemental tuition charged to the voucher parents. One cannot claim greater “cost effectiveness” for voucher subsidies in this case, since another party is picking up the difference. One can still argue that this voucher policy is wise, but the argument cannot be one of cost effectiveness.
Note also that the expenditure required to implement program alternatives may vary widely depending on setting or location. Labor costs may vary widely, and availability of appropriately trained staff may also vary, as would the cost of building space and materials. If space requirements are much greater for one alternative, while personnel requirements are greater for the second, it is conceivable that the relative cost effectiveness of the two alternatives could flip when evaluated in urban versus rural settings. There are few one-size-fits-all answers.
Cost-effectiveness analysis also requires having common outcome measures across alternative programs. This is relatively straightforward when comparing educational programs geared toward specific reading or math skills. But policy alternatives rarely focus on precisely the same outcomes. As such, cost-effectiveness analysis may require additional consideration of which outcomes have greater value and which are preferred over others. Levin and McEwan (2001) discuss these issues in terms of "cost-utility" analyses. For example, assume a cost-effectiveness analysis of two math programs, each of which focuses on two goals: conceptual understanding and more basic skills. Assume also that both require comparable levels of expenditure to implement and maintain and that both yield the same average combined scores on conceptual and basic-skills assessments. Program A, however, produces higher conceptual-understanding scores, while program B produces higher basic-skills scores. If school officials or state policy makers believe conceptual understanding to be more important, a weight might be assigned that favors the program that led to greater conceptual understanding.
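The weighting step in the math-program example above can be made concrete. In this sketch the scores and the weights are hypothetical; the point is only that an explicit utility weight converts two incomparable outcome profiles into a single comparable number:

```python
# Hypothetical cost-utility weighting for the two math programs described
# in the text: comparable costs, equal combined gains, different profiles.
scores = {
    "A": {"conceptual": 0.30, "basic": 0.10},  # gains in SD units (assumed)
    "B": {"conceptual": 0.10, "basic": 0.30},
}
# Assumed policy preference: conceptual understanding valued more highly.
weights = {"conceptual": 0.7, "basic": 0.3}

def utility(program: str) -> float:
    """Weighted sum of outcome gains for one program."""
    s = scores[program]
    return sum(weights[k] * s[k] for k in weights)

for p in scores:
    print(f"Program {p}: weighted utility {utility(p):.2f}")
```

With these weights, program A wins; reverse the weights and program B would. The ranking is a direct function of the stated preferences, which is why the weights themselves should be debated openly.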
In contrast to cost-effectiveness analysis, cost-benefit analysis involves dollar-to-dollar comparisons, both short-term and long-term. That is, instead of examining the estimated educational outcome effect of implementing and maintaining a given option, cost-benefit analysis examines the economic effects. But like cost-effectiveness analysis, cost-benefit analysis requires comparing alternatives:
Cost of implementation and maintenance of option A
÷ Estimated economic benefit (or dollar savings) of option A

versus

Cost of implementation and maintenance of option B
÷ Estimated economic benefit (or dollar savings) of option B
Again, the baseline option is generally the status quo, which is not assumed automatically to be the worst possible alternative. Cost-benefit analysis can be used to search for immediate, or short-term, cost savings. A school in need of computers might, for example, use this approach in deciding whether to buy or lease them, or it may use the approach to decide whether to purchase buses or contract out busing services. For a legitimate comparison, one must assume that the quality of service remains constant. Using these examples, the assumption would be that the quality of busing or computers is equal whether purchased, leased, or contracted, including service, maintenance, and all related issues. All else being equal, if the expenses incurred under one option are lower than under another, that option produces cost savings. As we will demonstrate later, this sort of example applies to a handful of recommendations presented on the Department of Education's website.
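The buy-versus-lease decision above is a short-term cost-savings comparison in miniature. The sketch below uses hypothetical prices and holds service quality constant, as the text requires; only total expense over a fixed horizon distinguishes the options:

```python
# Illustrative buy-vs-lease comparison for school computers over a fixed
# horizon. All prices are hypothetical; quality of service is assumed
# identical under both options, as a legitimate comparison requires.
years = 4
machines = 300

# Buy: up-front purchase plus annual maintenance (assumed figures).
buy_total = machines * 900 + years * 10_000

# Lease: annual per-machine charge with service included (assumed figure).
lease_total = years * (machines * 320)

cheaper = "buy" if buy_total < lease_total else "lease"
print(f"buy: ${buy_total:,}  lease: ${lease_total:,}  -> {cheaper} saves money")
```

With different assumed prices or a different horizon, the answer could flip, which is why such comparisons must be run with local figures rather than settled in the abstract.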
Cost-benefit analysis can also be applied to big-picture education policy questions, such as comparing the costs of implementing major reform strategies (class-size reduction or early childhood programs, for example) against raising existing teachers' salaries, or measuring the long-term economic benefits of those different programmatic options. This is also referred to as return-on-investment analysis.
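Long-term return-on-investment comparisons of the kind just described hinge on discounting: benefits that arrive decades later must be converted to present-day dollars before they can be weighed against up-front costs. The sketch below uses an entirely hypothetical cost, benefit stream, and discount rate:

```python
# Illustrative return-on-investment sketch: discount a stream of projected
# long-term benefits to present value and compare with up-front cost.
# The per-pupil cost, benefit stream, and 3% discount rate are all
# hypothetical assumptions.
def npv(cash_flows, rate):
    """Net present value of year-indexed cash flows (index 0 = today)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

cost_today = 10_000                   # per-pupil program cost, paid now
benefits = [0] * 15 + [1_500] * 30    # annual benefits beginning 15 years out
rate = 0.03

pv_benefits = npv(benefits, rate)
print(f"PV of benefits: ${pv_benefits:,.0f}; "
      f"benefit-cost ratio: {pv_benefits / cost_today:.2f}")
```

A benefit-cost ratio above 1 indicates the discounted benefits exceed the cost; note how sensitive the ratio is to the discount rate, since most benefits arrive far in the future.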
While cost-effectiveness and cost-benefit analyses are arguably under-used in education policy research, there are a handful of particularly useful examples:
- Determining whether certain comprehensive school reform models are more cost-effective than others.[ii]
- Determining whether computer-assisted instruction is more cost-effective than alternatives such as peer tutoring.[iii]
- Comparing National Board Certification for teachers to alternatives in terms of estimated effects and costs.[iv]
- Evaluating, through cost-benefit analysis, the long-term benefits, and associated costs, of participation in certain early-childhood programs.[v]
Another useful example is provided by a recent policy brief prepared by economists Brian Jacob and Jonah Rockoff, which provides insights regarding the potential costs and benefits of seemingly mundane organizational changes to the delivery of public education, including (a) changes to school start times for older students, based on research on learning outcomes by time of day; (b) changes in school-grade configurations, based on an increased body of evidence relating grade configurations, location transitions and student outcomes; and (c) more effective management of teacher assignments.[vi] While the authors do not conduct full-blown cost effectiveness or cost-benefit analyses, they do provide guidance on how pilot studies might be conducted.
As explained above, cost-benefit and cost-effectiveness analyses require analysts to isolate specific reform strategies in order to correspondingly isolate and cost the strategies’ components and estimate their effects. In contrast, relative-efficiency analyses focus on the production efficiency or cost efficiency of organizational units (such as schools or districts) as a whole. In the U.S. public education system, there are approximately 100,000 traditional public schools in roughly 15,000 traditional public school districts, plus 5,000 or so charter schools. Accordingly, there is significant and important variation in the ways these schools get things done. The educational status quo thus entails considerable variation in approaches and in quality, as well as in the level and distribution of funding and the population served.
Each organizational unit, be it a public school district, a neighborhood school, a charter school, a private school, or a virtual school, organizes its human resources, material resources, capital resources, programs, and services at least marginally differently from all others. The basic premise of using relative efficiency analyses to evaluate education reform alternatives is that we can learn from these variations. This premise may seem obvious, but it has been largely ignored in recent policymaking. Too often, it seems that policymakers gravitate toward a policy idea without any empirical basis, assuming that it offers a better approach despite having never been tested. It is far more reasonable, however, to assume that we can learn how to do better by (a) identifying those schools or districts that do excel, and (b) evaluating how they do it. Put another way, not all schools in their current forms are woefully inefficient, and any new reform strategy will not necessarily be more efficient. It is sensible for researchers and policymakers to make use of the variation in those 100,000 schools by studying them to see what works and what does not. These are empirical questions, and they can and should be investigated.
Efficiency analysis can be viewed from either of two perspectives: production efficiency or cost efficiency. Production efficiency (also known as “technical efficiency of production”) measures the outcomes of organizational units such as schools or districts given their inputs and given the circumstances under which production occurs. That is, which schools or districts get the most bang for the buck? Cost efficiency is essentially the flip side of production efficiency. In cost efficiency analyses, the goal is to determine the minimum “cost” at which a given level of outcomes can be produced under given circumstances. That is, what’s the minimum amount of bucks we need to spend to get the bang we desire?
In either case, three moving parts are involved. First, there are measured outcomes, such as student assessment outcomes. Second, there are existing expenditures by those organizational units. Third, there are the conditions, such as the varied student populations, and the size and location of the school or district, including differences in competitive wages for teachers, health care costs, heating and cooling costs, and transportation costs.
It is important to understand that all efficiency analyses, whether cost efficiency or production efficiency, are relative. Efficiency analysis is about evaluating how some organizational units achieve better or worse outcomes than others (given comparable spending), or how or why the “cost” of achieving specific outcomes using certain approaches and under some circumstances is more or less in some cases than others. Comparisons can be made to the efficiency of average districts or schools, or to those that appear to maximize output at given expense or minimize the cost of a given output. Efficiency analysis in education is useful because there are significant variations in key aspects of schools: what they spend, who they serve and under what conditions, and what they accomplish.
Efficiency analyses involve fitting statistical models to data on large numbers of schools or districts, typically over multiple years. While debate persists on the best statistical approaches for estimating cost efficiency or technical efficiency of production, the common goal across the available approaches is to determine which organizational units are more and less efficient producers of educational outcomes. Or, more precisely, the goal is to determine which units achieve specific educational outcomes at a lower cost.
Once schools or districts are identified as more (or less) efficient, the next step is to figure out why. Accordingly, researchers explore what variables across these institutions might make some more efficient than others, or what changes have been implemented that might have led to improvements in efficiency. Questions typically take one of two forms:
- Do districts or schools that do X tend to be more cost efficient than those doing Y?
- Did the schools or districts that changed their practices from X to Y improve in their relative efficiency compared to districts that did not make similar changes?
That is, the researchers identify and evaluate variations across institutions, looking for insights in those estimated to be more efficient, or alternatively, evaluating changes to efficiency in districts that have altered practices or resource allocation in some way. The latter approach is generally considered more relevant, since it speaks directly to changing practices and resulting changes in efficiency.[vii]
While statistically complex, efficiency analyses have been used to address a variety of practical issues, with implications for state policy, regarding the management and organization of local public school districts:
- Investigating whether school district consolidation can cut costs and identifying the most cost-efficient school district size.[viii]
- Investigating whether allocating state aid to subsidize property tax exemptions to affluent suburban school districts compromises relative efficiency.[ix]
- Investigating whether the allocation of larger shares of school district spending to instructional categories is a more efficient way to produce better educational outcomes.[x]
- Investigating whether decentralized governance of high schools improves efficiency.[xi]
These analyses have not always produced the results that policymakers would like to hear. Further, like many studies using rigorous scholarly methods, these analyses have limitations. They are necessarily constrained by the availability of data, they are sensitive to the quality of data, and they can produce different results when applied in different settings.[xii] But the results ultimately produced are based on rigorous and relevant analyses, and the U.S. Department of Education should be more concerned with rigor and relevance than convenience or popularity.
[i] Levin, H. M. (1983). Cost-Effectiveness. Thousand Oaks, CA: Sage.
Levin, H. M., & McEwan, P. J. (2001). Cost effectiveness analysis: Methods and applications. 2nd ed. Thousand Oaks, CA: Sage.
[ii] Borman, G., & Hewes, G. (2002). The long-term effects and cost-effectiveness of Success for All. Educational Evaluation and Policy Analysis, 24, 243-266.
[iii] Levin, H. M., Glass, G., & Meister, G. (1987). A cost-effectiveness analysis of computer assisted instruction. Evaluation Review, 11, 50-72.
[iv] Rice, J. K., & Hall, L. J. (2008). National Board Certification for teachers: What does it cost and how does it compare? Education Finance and Policy, 3, 339-373.
[v] Barnett, W. S., & Masse, L. N. (2007). Comparative Benefit Cost Analysis of the Abecedarian Program and its Policy Implications. Economics of Education Review, 26, 113-125.
[vi] See Jacob, B., & Rockoff, J. (2011). Organizing Schools to Improve Student Achievement: Start Times, Grade Configurations and Teacher Assignments. The Hamilton Project. Retrieved November 6, 2011 from http://www.hamiltonproject.org/files/downloads_and_links/092011_organize_jacob_rockoff_paper.pdf
See also Patrick McEwan’s review of this report:
McEwan, P. (2011). Review of Organizing Schools to Improve Student Achievement. Boulder, CO: National Education Policy Center. Retrieved December 2, 2011 from http://nepc.colorado.edu/thinktank/review-organizing-schools
[vii] Numerous authors have addressed the conceptual basis and empirical methods for evaluating technical efficiency of production and cost efficiency in education or government services more generally. See, for example:
Bessent, A. M., & Bessent, E. W. (1980). Determining the Comparative Efficiency of Schools through Data Envelopment Analysis, Education Administration Quarterly, 16(2), 57-75.
Duncombe, W., Miner, J., & Ruggiero, J. (1997). Empirical Evaluation of Bureaucratic Models of Inefficiency, Public Choice, 93(1), 1-18.
Duncombe, W., & Bifulco, R. (2002). Evaluating School Performance: Are we ready for prime time? In William J. Fowler, Jr. (Ed.), Developments in School Finance, 1999–2000, NCES 2002–316. Washington, DC: U.S. Department of Education, National Center for Education Statistics.
Grosskopf, S., Hayes, K. J., Taylor, L. L., & Weber, W. (2001). On the Determinants of School District Efficiency: Competition and Monitoring. Journal of Urban Economics, 49, 453-478.
[viii] Duncombe, W. & Yinger, J. (2007). Does School District Consolidation Cut Costs? Education Finance and Policy, 2(4), 341-375.
[ix] Eom, T. H., & Rubenstein, R. (2006). Do State-Funded Property Tax Exemptions Increase Local Government Inefficiency? An Analysis of New York State’s STAR Program. Public Budgeting and Finance, Spring, 66-87.
[x] Taylor, L. L., Grosskopf, S., & Hayes, K. J. (2007). Is a Low Instructional Share an Indicator of School Inefficiency? Exploring the 65-Percent Solution. Working Paper.
[xi] Grosskopf, S., & Moutray, C. (2001). Evaluating Performance in Chicago Public High Schools in the Wake of Decentralization. Economics of Education Review, 20, 1-14.
[xii] See, for example, Duncombe, W., & Bifulco, R. (2002). “Evaluating School Performance: Are we ready for prime time?” In William J. Fowler, Jr. (Ed.), Developments in School Finance, 1999–2000, NCES 2002–316. Washington, DC: U.S. Department of Education, National Center for Education Statistics.