You can download a PDF version of this article at https://eric.ed.gov/?id=ED603587. Please note that this web version was later updated to add the following reference.
Biancarosa, Gina, Anthony S. Bryk, and Emily R. Dexter. “Assessing the Value-Added Effects of Literacy Collaborative Professional Development on Student Learning.” Elementary School Journal 111, no. 1 (September 2010): 7–34. https://doi.org/10.1086/653468.
Return on investment (ROI) is a concept that originated in the business world in the early twentieth century. According to the Hagley Museum and Library, the concept and the formula were first developed in 1914 by Donaldson Brown, the Assistant Treasurer of the DuPont Company, for monitoring business performance as the company was grappling with diversifying from explosives to lacquers, Pyralin plastics and dyes. ROI was quickly adopted by the DuPont Company as their primary performance measure for all of its departments and later required for all capital appropriations and projects submitted for approval by the company’s senior management.
Because of its simplicity and versatility, ROI was widely adopted across industries. Applying the ROI concept to education, Levenson coined the term Academic Return on Investment (A-ROI) and called for public school systems to use A-ROI to drive decisions on resource use. A-ROI has received growing attention in recent years, especially with the implementation of the Every Student Succeeds Act (ESSA), which promotes using evidence to strengthen education investments.
The A-ROI calculated by Levenson’s formula is essentially a reversed cost-effectiveness ratio (CER). While cost-effectiveness analysis is important and necessary, the rigor it requires is beyond the reach of most school districts, which have neither the capacity nor the resources to do such work. In addition, the time, effort, and staffing needed for such analysis make it very difficult to evaluate a reasonably large number of investment items, a situation that is not uncommon in large urban school districts. At the same time, applying A-ROI to high-stakes budgetary decisions raises a number of methodological issues that could lead to wrong decisions but have not received enough attention among practitioners.
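To make the "reversed ratio" relationship concrete, the sketch below treats A-ROI as effect per dollar and CER as dollars per unit of effect, so that each is the reciprocal of the other. The function names and the numbers are illustrative assumptions; this is not Levenson’s exact formula and the figures are not real program data.

```python
def cost_effectiveness_ratio(cost: float, effect: float) -> float:
    """Dollars spent per unit of effect (lower is better)."""
    if effect == 0:
        raise ValueError("CER is undefined when the effect is zero")
    return cost / effect


def a_roi(effect: float, cost: float) -> float:
    """Effect obtained per dollar spent (higher is better)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return effect / cost


# Hypothetical program: 50 units of total learning gain for $10,000.
effect, cost = 50.0, 10_000.0
print(a_roi(effect, cost))                     # effect per dollar
print(cost_effectiveness_ratio(cost, effect))  # dollars per unit of effect
```

Under these assumptions the two measures rank programs identically, only in opposite directions, so every data requirement of a cost-effectiveness analysis carries over to A-ROI.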
In this series, I discuss five issues around using A-ROI for informing and improving decisions. The first part focuses on the validity of A-ROI as a cost-effectiveness measure of investments. Critiques are welcome.
Conceptually, the validity of A-ROI for cost-effectiveness comparisons depends on two primary assumptions: one about the time needed to observe an effect, and one about the causal inferences drawn from observed effects. Both have implications for when to calculate or recalculate an investment’s A-ROI and for whether the result is appropriate for decision making.
The first assumption is that each investment has a substantive core that remains unchanged in each iteration of its implementation. In education, however, it is widely known that it takes time for investments to be rolled out, take shape, and become institutionalized. As a result, the timing of the A-ROI calculation matters because the results could differ and, more importantly, carry different meanings depending on the timing. An A-ROI calculated before an investment is fully implemented will likely differ from the A-ROI of interest, namely the one obtained after both the effect and the cost stabilize. (A stabilized effect and cost could still vary randomly. How to detect stabilization of effect and cost is another topic that needs further investigation.)
Depending on the situation, the premature calculation could either under- or over-estimate an investment’s true effect. For simplicity, Figure 1 shows four hypothetical investment scenarios, with each effect trend line representing a possible path to maturation an investment could take during the first five years of implementation.
Among the four investments shown in Figure 1, Investment C has an impact at the end of the first year of implementation, and the effect remains unchanged subsequently. For Investment B, however, the full effect is not seen for the first three years, which could happen if the effect magnitude is correlated with the level or phase of implementation, or if the effect lags behind the implementation. As a result, an A-ROI calculated before Year 3 will under-estimate the true effect. For example, in a four-year longitudinal study of Literacy Collaborative, one of the largest and oldest literacy coaching programs in the US, Biancarosa et al. (2010) found that a small on-average positive effect detected in the first year, as novice coaches began to take up their new roles, more than doubled over the next two years.
For Investment D, the effect peaks at the end of the first year of implementation, wanes in the following year, and remains unchanged after that. A typical example of Investment D is a technology intervention program, where the strong early results are largely due to the novelty effect of new technology. In this case, the A-ROI calculated at the end of the first year of implementation will over-estimate the true effect.
Finally, the A-ROI result for Investment A at the end of the first year would show a negative effect, even though the investment has the largest effect after three years of implementation. This is possible when the new program requires an approach that is very different from existing conventions, leading to disruption or confusion in early implementation. (It should be noted that estimating a mature A-ROI with stabilized effects and costs introduces the possibility that the result is confounded by other programs implemented during the maturation period.)
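The four scenarios can be sketched numerically. The effect sizes below are invented to mimic the shapes described for Investments A through D; they are not the values behind Figure 1, and the helper `early_bias` is a hypothetical name. The sketch assumes the last year in each series represents the stabilized effect.

```python
# Hypothetical yearly effect sizes (Years 1-5); NOT the data behind Figure 1.
trajectories = {
    "A": [-0.10, 0.05, 0.30, 0.30, 0.30],  # early disruption, largest mature effect
    "B": [0.05, 0.10, 0.20, 0.20, 0.20],   # effect grows until Year 3
    "C": [0.15, 0.15, 0.15, 0.15, 0.15],   # stable from Year 1
    "D": [0.25, 0.10, 0.10, 0.10, 0.10],   # novelty effect fades after Year 1
}


def early_bias(effects, year=1):
    """Compare the effect observed in `year` with the stabilized effect.

    Assumes the final entry of `effects` is the stabilized (mature) effect.
    """
    early = effects[year - 1]
    mature = effects[-1]
    if early < mature:
        return "under-estimates"
    if early > mature:
        return "over-estimates"
    return "matches"


for name, effects in trajectories.items():
    print(f"Investment {name}: Year 1 {early_bias(effects)} the mature effect")
```

With these made-up numbers, a Year 1 calculation under-estimates A and B, matches C, and over-estimates D, while a Year 3 calculation matches the mature effect in all four cases.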
Similarly, costs can be over- or under-estimated. When a program is implemented in phases with different cost elements introduced at each phase, early calculations of A-ROI may under-estimate the total cost. Under-estimation of costs could also arise when implementation necessitates additional cost that is not anticipated during a program’s planning process. For example, a program that has received funding to hire more minority teachers as part of a larger literacy program might run into difficulty in recruiting minority teachers and thus need additional funding for that particular aspect. Over-estimation of costs tends to happen with economies of scale. That is, the cost remains unchanged when the number of participants increases or, more generally, when participation growth outpaces cost increases.
Table 1 shows the potential bias in the A-ROI results when over- and/or under-estimation occurs with the effect, cost, or both.
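The direction of the resulting bias follows from treating A-ROI as effect divided by cost. The sketch below is one way to derive it, assuming a positive true effect and a positive cost; the function name and the -1/0/+1 coding are illustrative assumptions, and the output may not match Table 1 cell for cell.

```python
def a_roi_bias(effect_bias: int, cost_bias: int) -> str:
    """Direction of bias in A-ROI = effect / cost.

    Each argument is -1 (under-estimated), 0 (accurate), or +1 (over-estimated).
    Assumes the true effect and the true cost are both positive.
    """
    # An inflated effect or a deflated cost pushes the ratio up.
    pushes_up = effect_bias > 0 or cost_bias < 0
    # A deflated effect or an inflated cost pushes the ratio down.
    pushes_down = effect_bias < 0 or cost_bias > 0
    if pushes_up and pushes_down:
        return "indeterminate"  # opposing forces; depends on magnitudes
    if pushes_up:
        return "over-estimated"
    if pushes_down:
        return "under-estimated"
    return "unbiased"


labels = {-1: "under", 0: "accurate", 1: "over"}
for e in (-1, 0, 1):
    for c in (-1, 0, 1):
        print(f"effect {labels[e]:8} cost {labels[c]:8} -> A-ROI {a_roi_bias(e, c)}")
```

When effect and cost are biased in the same direction, the net bias cannot be signed without knowing the relative magnitudes, which is why the sketch reports those cases as indeterminate.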
It is important to point out that the above discussion should not prevent A-ROI from being calculated for an investment during the first few years of its implementation. There are practical reasons why stakeholders would demand A-ROI information soon after an investment is implemented. For example, district leaders want to know whether adjustments are needed to improve implementation, and board members want to make sure that the investment they approved is producing results. However, the discussion does point to the need to communicate to stakeholders how the A-ROI information should be interpreted and used for decision making.
Once A-ROI is calculated for an investment with stable effects and costs, that measure cannot be uncritically used year after year, because of the ever-changing nature of educational contexts. For example, in public schools, it is not uncommon to see leadership changes, especially in large urban school districts where the superintendent tenure averages three years.
Following a leadership change, existing programs typically face two potential outcomes. One is that the program no longer has an owner and champion, which leads to less attention and accountability and, eventually, deteriorating implementation. The other is that the new leader takes the reins of an inherited program and significantly reshapes it to fit his or her vision, philosophy, and approach. In either case, the substantive core is likely to change during the transition, although the program often carries the same name. This raises the question of whether the A-ROI for such investments should be recalculated after a transition.
The second assumption on which the validity of A-ROI rests is that the return is truly due to the investment. Establishing this is at the core of rigorous evaluation: it requires collecting evidence that rules out alternative explanations of the observed return. This is important because a number of factors may affect investment outcomes, such as natural trends among participants, selection bias whereby gains or losses are really associated with participant characteristics rather than the program, or confounding effects from other programs implemented at the same time to improve student achievement.
Table 2 summarizes the nine possible pairs of true and estimated effects, written as true-estimated with P, Z, and N denoting a positive, zero, and negative effect. Theoretically, all colored cells in Table 2 represent biased or potentially biased scenarios, all of which should be guarded against through “strong evidence” from well-designed and well-implemented experimental studies, or “moderate evidence” from well-designed and well-implemented quasi-experimental studies, as defined by the US Department of Education (2016). Practically, however, they differ in their implications and significance for gathering and using evidence to facilitate decision making.
For scenarios represented by red cells, the biased estimates will direct leaders to make wrong decisions and should definitely be avoided. In P-Z and P-N situations, leaders will be inclined to end effective programs, and in N-P situations, deleterious programs will likely be continued. In those cases, reliance on A-ROI should be avoided because the resulting decisions may directly harm students. For the yellow-cell (Z-P) scenario, the biased estimate does not harm students directly but may waste money that could have been spent on effective programs. Here again, decisions relying on the biased estimates are not warranted. For green-cell scenarios, the biased estimates will still prompt leaders to make the right decisions and are thus acceptable despite the bias. The different significance levels of bias and their implications for decision making are summarized in Table 3.
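The color coding described above can be reproduced from a simple decision rule. The sketch below assumes that leaders continue a program only when its effect is positive and end it otherwise; that rule, the function names, and the label for cells where the sign is estimated correctly are assumptions made for illustration. The red and yellow cells match those named in the text, while the green cells (Z-N and N-Z) are what the rule implies rather than something read off Table 2.

```python
def decision(sign: str) -> str:
    """Assumed rule: continue a program only if its effect is positive."""
    return "continue" if sign == "P" else "end"


def bias_level(true: str, estimated: str) -> str:
    """Classify a true-estimated pair of effect signs ('P', 'Z', or 'N')."""
    if true == estimated:
        return "no sign bias"  # P-P, Z-Z, N-N (magnitude may still be biased)
    if decision(true) == decision(estimated):
        return "green"         # biased estimate, but the same decision results
    if true == "Z":
        return "yellow"        # Z-P: ineffective program continued, money wasted
    return "red"               # P-Z, P-N, N-P: decision may harm students


for t in "PZN":
    for e in "PZN":
        print(f"{t}-{e}: {bias_level(t, e)}")
```

Framing the cells this way makes the priority explicit: the evidence burden is highest where a wrong sign flips the decision, and lowest where the decision would be the same either way.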
Undoubtedly, “strong evidence” and “moderate evidence” should be pursued whenever feasible, because they give us the best chance of guarding against the biases represented by the colored cells in Table 2. However, as pointed out earlier, that is not feasible for most school districts, given the complexity, time, and effort required to conduct well-designed and well-implemented experimental and quasi-experimental studies.
For practical purposes, our goal is to develop an A-ROI method that provides different levels of protection against the various biases that arise from the compromise between rigor and practicality. For example, for any given estimated negative result, the A-ROI method should provide a high level of confidence that the program does not actually have a positive effect (the P-N cell). If the possibility of a true positive effect is ruled out with high confidence, the burden to produce evidence distinguishing Z-N from N-N results should be less demanding.
For situations represented by cell P-P, there is another level of bias that warrants close attention. That is, the true effect E of a program may be under-estimated or over-estimated by its estimate Ê, which will lead to an unfavorable or favorable assessment, respectively, when the program is compared with other programs whose impact estimates are unbiased.
This second level of bias seems harder to protect against and requires more rigorous evidence than protection against the biases presented in Table 2. Nevertheless, the point here is that, for practical purposes, biases need not all be treated as equally unacceptable. They can be differentiated by their implications for decision making and by how much evidence is needed to guard against them.