As we reach the two-year mark of the initial wave of pandemic-induced school shutdowns, academic normalcy remains out of reach for many students, educators, and parents. In addition to surging COVID-19 cases at the end of 2021, schools have faced severe staff shortages, high rates of absenteeism and quarantines, and rolling school closures. Furthermore, students and educators continue to struggle with mental health challenges, higher rates of violence and misbehavior, and concerns about lost instructional time.
As we outline in our new research study released in January, the cumulative impact of the COVID-19 pandemic on students’ academic achievement has been large. We tracked changes in math and reading test scores across the first two years of the pandemic using data from 5.4 million U.S. students in grades 3-8. We focused on test scores from immediately before the pandemic (fall 2019), following the initial onset (fall 2020), and more than one year into pandemic disruptions (fall 2021).
Average fall 2021 math test scores in grades 3-8 were 0.20-0.27 standard deviations (SDs) lower relative to same-grade peers in fall 2019, while reading test scores were 0.09-0.18 SDs lower. This is a sizable drop. For context, the math drops are notably larger than estimated impacts from other large-scale school disruptions: after Hurricane Katrina, for example, math scores dropped 0.17 SDs in one year for New Orleans evacuees.
Even more concerning, test-score gaps between students in low-poverty and high-poverty elementary schools grew by approximately 20% in math (corresponding to 0.20 SDs) and 15% in reading (0.13 SDs), primarily during the 2020-21 school year. Further, achievement tended to drop more between fall 2020 and 2021 than between fall 2019 and 2020 (both overall and differentially by school poverty), indicating that disruptions to learning have continued to negatively impact students well past the initial hits following the spring 2020 school closures.
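As a back-of-the-envelope reading of these gap figures, the sketch below derives the pre-pandemic gap size that the paired numbers would imply. It assumes the percentages are measured relative to the pre-pandemic gap; that assumption, the function name, and the variable names are ours for illustration, and because both inputs are approximate the outputs are rough magnitudes rather than reported results.

```python
def implied_baseline_gap(growth_sd: float, growth_share: float) -> float:
    """Pre-pandemic gap (in SDs) implied by an absolute and a relative growth figure.

    Assumption: growth_share is the growth expressed as a fraction of the
    pre-pandemic gap, so baseline = growth_sd / growth_share.
    """
    if growth_share <= 0:
        raise ValueError("growth_share must be positive")
    return growth_sd / growth_share


# (growth in SDs, growth as a share of the baseline gap), as quoted in the text.
GAP_GROWTH = {"math": (0.20, 0.20), "reading": (0.13, 0.15)}

for subject, (growth_sd, growth_share) in GAP_GROWTH.items():
    baseline = implied_baseline_gap(growth_sd, growth_share)
    print(f"{subject}: implied pre-pandemic gap of roughly {baseline:.2f} SDs, "
          f"roughly {baseline + growth_sd:.2f} SDs after the growth")
```

Read this way, the growth figures describe a widening of gaps that were already large before the pandemic, which is why a 15-20% increase is a substantial change in absolute terms.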
These numbers are alarming and potentially demoralizing, especially given the heroic efforts of students to learn and educators to teach in incredibly trying times. From our perspective, these test-score drops in no way indicate that these students represent a “lost generation” or that we should give up hope. Most of us have never lived through a pandemic, and there is so much we don’t know about students’ capacity for resiliency in these circumstances and what a timeline for recovery will look like. Nor are we suggesting that teachers are somehow at fault given the achievement drops that occurred between 2020 and 2021; rather, educators had difficult jobs before the pandemic, and now are contending with huge new challenges, many outside their control.
Clearly, however, there’s work to do. School districts and states are currently making important decisions about which interventions and strategies to implement to mitigate the learning declines during the last two years. Elementary and Secondary School Emergency Relief (ESSER) investments from the American Rescue Plan provided nearly $200 billion to public schools to spend on COVID-19-related needs. Of that sum, $22 billion is dedicated specifically to addressing learning loss using “evidence-based interventions” focused on the “disproportionate impact of COVID-19 on underrepresented student subgroups.” Reviews of district and state spending plans (see Future Ed, EduRecoveryHub, and RAND’s American School District Panel for more details) indicate that districts are spending their ESSER dollars designated for academic recovery on a wide variety of strategies, with summer learning, tutoring, after-school programs, and extended school-day and school-year initiatives rising to the top.
Comparing the negative impacts from learning disruptions to the positive impacts from interventions
To help contextualize the magnitude of the impacts of COVID-19, we situate test-score drops during the pandemic relative to the test-score gains associated with common interventions being employed by districts as part of pandemic recovery efforts. If we assume that such interventions will continue to be as successful in a COVID-19 school environment, can we expect that these strategies will be effective enough to help students catch up? To answer this question, we draw from recent reviews of research on high-dosage tutoring, summer learning programs, reductions in class size, and extending the school day (specifically for literacy instruction). We report effect sizes for each intervention specific to a grade span and subject wherever possible (e.g., tutoring has been found to have larger effects in elementary math than in reading).
Figure 1 shows the standardized drops in math test scores between students testing in fall 2019 and fall 2021 (separately by elementary and middle school grades) relative to the average effect size of various educational interventions. The average effect size for math tutoring matches or exceeds the average COVID-19 score drop in math. Research on tutoring indicates that it often works best in younger grades, and when provided by a teacher rather than, say, a parent. Further, some of the tutoring programs that produce the biggest effects can be quite intensive (and likely expensive), including having full-time tutors supporting all students (not just those needing remediation) in one-on-one settings during the school day. Meanwhile, the average effect of reducing class size is negative but not significant, with high variability in the impact across different studies. Summer programs in math have been found to be effective (average effect size of 0.10 SDs), though these programs in isolation likely would not eliminate the COVID-19 test-score drops.
Figure 1: Math COVID-19 test-score drops compared to the effect sizes of various educational interventions
Source: COVID-19 score drops are pulled from Kuhfeld et al. (2022) Table 5; reduction-in-class-size results are from pg. 10 of Filges et al. (2018) Table 2; summer program results are pulled from Lynch et al. (2021) Table 2; and tutoring estimates are pulled from Nickow et al. (2020) Table 3B. Ninety-five percent confidence intervals are shown with vertical lines on each bar.
Notes: Kuhfeld et al. and Nickow et al. reported effect sizes separately by grade span; Filges et al. and Lynch et al. report an overall effect size across elementary and middle grades. We were unable to find a rigorous study that reported effect sizes for extending the school day/year on math performance. Nickow et al. and Kraft & Falken (2021) also note large variations in tutoring effects depending on the type of tutor, with larger effects for teacher and paraprofessional tutoring programs than for nonprofessional and parent tutoring. Class-size reductions included in the Filges meta-analysis ranged from a minimum of one to a minimum of eight students per class.
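To make the summer-program comparison concrete, here is a minimal sketch of the arithmetic using only figures quoted in the text: the 0.20-0.27 SD range of math drops and the 0.10 SD average effect of summer math programs. The function names are hypothetical, and the calculation assumes the pre-pandemic effect size carries over unchanged and simply adds to depressed scores, which is an open question.

```python
# Figures quoted in the text, in standard deviation (SD) units.
MATH_DROP_RANGE = (0.20, 0.27)  # fall 2019 to fall 2021 math drop, grades 3-8
SUMMER_MATH_EFFECT = 0.10       # average effect of summer math programs


def share_offset(effect_sd: float, drop_sd: float) -> float:
    """Fraction of a score drop that an intervention effect would offset.

    Assumption: the effect transfers unchanged to pandemic conditions
    and adds directly to the depressed scores.
    """
    if drop_sd <= 0:
        raise ValueError("drop_sd must be positive")
    return effect_sd / drop_sd


def remaining_drop(effect_sd: float, drop_sd: float) -> float:
    """SDs still to recover after one application of the intervention."""
    return max(drop_sd - effect_sd, 0.0)


for drop in MATH_DROP_RANGE:
    share = share_offset(SUMMER_MATH_EFFECT, drop)
    left = remaining_drop(SUMMER_MATH_EFFECT, drop)
    print(f"drop of {drop:.2f} SDs: summer program offsets {share:.0%}, "
          f"{left:.2f} SDs remain")
```

Under these assumptions, a single summer program would offset somewhere between about a third and a half of the math drop, consistent with the point that such programs in isolation likely would not close it.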
Figure 2 displays a similar comparison using effect sizes from reading interventions. The average effect of tutoring programs on reading achievement is larger than the effects found for the other interventions, though summer reading programs and class-size reductions both produced average effect sizes in the ballpark of the COVID-19 reading score drops.
Figure 2: Reading COVID-19 test-score drops compared to the effect sizes of various educational interventions
Source: COVID-19 score drops are pulled from Kuhfeld et al. (2022) Table 5; extended-school-day results are from Figlio et al. (2018) Table 2; reduction-in-class-size results are from pg. 10 of Filges et al. (2018); summer program results are pulled from Kim & Quinn (2013) Table 3; and tutoring estimates are pulled from Nickow et al. (2020) Table 3B. Ninety-five percent confidence intervals are shown with vertical lines on each bar.
Notes: While Kuhfeld et al. and Nickow et al. reported effect sizes separately by grade span, Figlio et al. and Kim & Quinn report an overall effect size across elementary and middle grades. Class-size reductions included in the Filges meta-analysis ranged from a minimum of one to a minimum of eight students per class.
There are some limitations to drawing on research conducted prior to the pandemic to understand our ability to address the COVID-19 test-score drops. First, these studies were conducted under conditions that are very different from what schools currently face, and it is an open question whether these interventions will be as effective during the pandemic as they were before it. Second, we have little evidence and guidance about the efficacy of these interventions at the unprecedented scale at which they are now being considered. For example, many school districts are expanding summer learning programs but have struggled to find enough staff interested in teaching summer school to meet the increased demand. Finally, given the widening test-score gaps between low- and high-poverty schools, it’s uncertain whether these interventions can actually combat the range of new challenges educators are facing in order to narrow these gaps. That is, students could catch up overall, yet the pandemic might still have lasting, negative effects on educational equality in this country.
Given that the current initiatives are unlikely to be implemented consistently across (and sometimes within) districts, timely feedback on the effects of initiatives and any needed adjustments will be crucial to districts’ success. The Road to COVID Recovery project and the National Student Support Accelerator are two large-scale evaluation efforts that aim to produce this type of evidence while providing resources for districts to track and evaluate their own programming. Additionally, a growing number of resources offer recommendations on how best to implement recovery programs, including scaling up tutoring, summer learning programs, and expanded learning time.
Ultimately, there is much work to be done, and the challenges for students, educators, and parents are considerable. But this may be a moment when decades of educational reform, intervention, and research pay off. Relying on what we have learned could show the way forward.
The Brown Center Chalkboard launched in January 2013 as a weekly series of new analyses of policy, research, and practice relevant to U.S. education.
In July 2015, the Chalkboard was re-launched as a Brookings blog in order to offer more frequent, timely, and diverse content. Contributors to both the original paper series and current blog are committed to bringing evidence to bear on the debates around education policy in America.