The Case for Annual Testing

The new U.S. Congress is moving posthaste to reauthorize the Elementary and Secondary Education Act (ESEA). With Republicans in the majority in both houses and the relevant committees chaired by individuals with considerable legislative skills (Lamar Alexander in the Senate and John Kline in the House), the smart money is on Obama seeing a bill in this session.

The most recent incarnation of ESEA, signed into law in January of 2002 by President George W. Bush, is the No Child Left Behind Act (NCLB).  NCLB is the seventh reauthorization of ESEA since 1965, which means that Congress historically reworked this legislation roughly every five years.  We’re now 13 years into NCLB, so reauthorization is long overdue. It is not just the long delay that argues for congressional action, but the extent to which the Obama administration has replaced the provisions of the bill with its own set of priorities implemented through Race to the Top and state waivers.  Whatever one thinks of the appropriate federal role in education, there are surely strong reasons in our constitutional democracy to prefer that we get to where we are going through law rather than executive edict.

That said, this is a perilous moment for reauthorization because of the backlash against standards, testing, and accountability.  The effort to put “the standardized testing machine in reverse,” in the words of New York mayor Bill de Blasio, has diverse bastions of support.  These include: conservatives who object to the seemingly ever-expanding reach of the federal government into K-12 public education; concerned parents of children in well-regarded, often suburban schools, who believe that test-prep activities have narrowed the curriculum and put undesirable pressure on their children; progressives such as de Blasio, who see the challenges of public education as best addressed by more funding for schools and broad efforts to eliminate poverty rather than by holding schools or teachers accountable for results; and teacher unions that are doing what unions are expected to do by trying to protect the less effective of their members from the consequences that follow from exposing their ineptitude in the classroom.

Conservatives, progressives, concerned parents, and unions: That is a formidable set of opponents to standards, testing, and accountability. You would expect these groups to have captured the attention of elected representatives in Washington.  And they have.  Insiders believe that the draft ESEA reauthorization bills that are afoot in the Senate and House will do away with the NCLB requirement that states carry out annual testing of all children in grades 3-8 in reading and mathematics. 

There are many things that need fixing in NCLB, certainly including its unrealistic accountability provisions (under which every child in the nation was expected to be proficient in reading and math by last year and schools were threatened with restructuring for failing to make adequate progress towards this goal).  But it would be a serious mistake for Congress to treat standards, testing, and accountability as a single target to be taken out with a shotgun blast.  Each member of the triumvirate can exist on its own and has a different impact on school performance.  And each has a different rationale and political basis in our federal system of education. 

What follows is a case for retaining in ESEA annual testing requirements that produce  information on the growth in student achievement from one year to the next, while eliminating most of the provisions related to standards and accountability, control over which would revert to the states and school districts.  The argument has four parts: 1) federal control of standards and accountability is unnecessary whereas the provision of information on the performance of schools is a uniquely federal responsibility; 2) test scores are valid indicators of student learning that matters for important long-term outcomes and therefore provide essential information on school performance; 3) several important functions for managing and improving education depend on annual measures of student achievement growth; 4) most of the political opponents of standards, testing, and accountability should favor the retention of annual testing shorn of federally dictated standards and accountability.

Federal control of standards and accountability is unnecessary

Differences in the quality and rigor of standards for what students should learn are unlikely to have a substantial effect on academic achievement (as our colleague Tom Loveless has shown) and, in any case, do not require federal involvement.  The Common Core, for example, was initiated by states operating through the National Governors Association.  Since there is no imperative for federal involvement either from the perspective of constitutional responsibility or from the functional perspective of unmet need, there is no reason for requirements for state standards to be part of the ESEA reauthorization.

Federal school-based accountability is different in that the evidence demonstrates that it has had meaningful albeit relatively modest impacts on student learning, concentrated in mathematics in the worst schools and for the lowest performing students.  Roughly half the states had consequential accountability systems in place prior to NCLB, and all do today.  The function of NCLB’s accountability mandates can now be carried out by states through their own systems, perhaps supplemented by a limited federal focus on schools that fail at basic functions, e.g., schools in which significant percentages of students do not acquire basic competencies in reading and math in elementary and middle school, or do not graduate from high school. States could comply with such a federal requirement by identifying a basic competency cutoff score on their state test.  Such a cutoff would be lower than current proficiency targets and realistically obtainable for nearly all students.  This would allow schools in which nearly all children will acquire basic competencies as a matter of course to focus their instructional efforts where they wish without fear of running afoul of the federal accountability system. 

The provision of valid and actionable information on school performance is a uniquely federal responsibility

Information on school performance in education is a public good, meaning that individuals cannot be effectively excluded from using the information once it exists. Because it is impossible to prevent consumers who have not paid for the information from consuming it, far too little evidence will be produced if it is not required by the federal government.  Further, only local authorities can collect information on school performance from test scores and other local data, but their narrow self-interests are not usually served by making that information easily accessible and useable by the public.  Only federal requirements will achieve that end.   Finally, evidence on school performance does not merely need to be produced; it needs to be of high quality.  But gathering and auditing data are almost pure public services. That is why even when information on school or company performance is treated as a private good to support more informed consumer choice (e.g., college search sites that require a fee for access, or stock market services that sell advice on individual stocks), the information that customers pay to access is derived overwhelmingly from federal sources. In short, federal support for gathering and disseminating information on school performance is easy to justify.  If the federal government doesn’t support it, it will not happen.

Student learning impacts long-term outcomes that everyone should value, and test scores are valid indicators of such learning

Scores that students receive on standardized tests administered in schools are strongly predictive of later life outcomes that are of great value to those students and the nation, after controlling for all the other observable characteristics of those students that are associated with later success.  What’s more, gains in test scores that result from interventions such as being assigned to a particularly effective teacher or attending a school facing accountability pressure also predict improvements in adult outcomes.  In other words, how much students learn in school makes a big difference in their lives, and standardized tests capture valid information on this.  As such, information on school performance that does not include information on student learning as measured by standardized tests will be badly compromised, like information on the performance of a publicly traded stock that does not include its historical returns.

Recent work by economists Raj Chetty, John Friedman, and Jonah Rockoff on teacher effectiveness utilizes test score data in reading and math in grades 3-8 in New York City linked to IRS records for the same students as they became adults.  Our focus here is on these linked records and what they tell us about the predictive power of test scores, rather than on the story they tell about teacher effectiveness that was the focus of the Chetty et al. study.  The school records of the study sample provide test scores as well as a rich set of control variables, including student variables (e.g., gender, ethnicity, special education status, record of suspensions, and limited English proficiency) and school variables (e.g., class size, teacher experience).  The tax records include individual earnings, information on college attendance, and child dependents (from which mothers who were teenagers when they gave birth could be identified).

Without controlling for other student characteristics and school variables the association between student test scores in grades 3-8 and later outcomes is huge, but it could reflect, for example, the impact of ethnicity or limited English proficiency independent of student learning.  With all controls in place the most important of these omitted variables are accounted for.  The association is still very large and most plausibly a function of the academic knowledge that is being assessed on the standardized tests.

The figure below (generated by the authors of this piece from information in Appendix Table 3 of Chetty et al.) represents the relationship between earnings at age 28 for students from the 5th to the 95th percentile in test scores in reading and math in grades 3-8, after adjusting earnings for the influence of all the control variables mentioned above.

Figure 1. Association between student test scores in grades 3-8 and earnings at age 28 (with earnings adjusted for student and school characteristics)

To compare two points on the graph: relative to individuals who as students were at the 30th percentile, individuals who were at the 70th percentile in test scores in elementary and middle school were earning 13.6% more as young adults.  To repeat, the association is net of the other variables that entered into the prediction as controls.  Some of these controls, such as special education status, capture the impact of student knowledge as measured on standardized tests, and thus bias downward the association that is represented in Figure 1.  Nevertheless, the estimated impact on earnings is very large.

Similarly large impacts are found on college attendance, the quality of college attended, the quality of the neighborhood of residence, and giving birth as a teenager among females.  Some of these relationships surely reflect the impact of students’ innate ability on their adult outcomes, but many studies have found that interventions that impact test scores also have impacts on later-life outcomes, including class-size reduction, school accountability, charter schools, and exposure to highly effective teachers. Given these strong predictive and causal relationships, who among us that wants to improve education would choose to ignore how much students are learning in school as measured by standardized tests?  Unless the federal government mandates the collection and dissemination of this information, we will be bereft of one of the best indicators we currently have of school performance.

Many school management and improvement functions depend on annual measures of student growth

Test scores matter for any form of accountability, including market-based accountability in which parents choose the schools their children attend and funding follows students to their school of choice.  Proponents of charter schools and open enrollment in public schools will find that the informational fuel of their favored version of school reform will evaporate without valid information on annual student gains.

The removal of the requirement of annual testing will, necessarily, all but eliminate school-based accountability for the learning of subgroups of children because, as Whitehurst and Lindquist have shown, testing only samples of children or only one grade of children often leads to sample sizes for subgroups such as English learners and blacks that are too small to generate reliable estimates for the school as a whole.  Thus, those concerned with equity should strongly support annual testing in multiple grades.
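The sample-size point above can be illustrated with a rough sketch. It assumes that a subgroup's performance is summarized as a simple proportion of students meeting a benchmark, and it uses the standard binomial standard error. The school size, subgroup size, and pass rate are made-up numbers chosen for illustration; they are not figures from Whitehurst and Lindquist.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n students."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical school: 6 subgroup students in each grade.
subgroup_per_grade = 6
pass_rate = 0.5  # assumed pass rate; 0.5 yields the largest standard error

# Compare testing a single grade with testing six grades (3 through 8).
scenarios = {
    "one grade tested": subgroup_per_grade * 1,
    "grades 3-8 tested": subgroup_per_grade * 6,
}

for label, n in scenarios.items():
    se = standard_error(pass_rate, n)
    print(f"{label}: n={n}, standard error={se:.3f}")
```

Under these assumed numbers, the standard error of the subgroup estimate falls from roughly 0.20 with one grade tested to roughly 0.08 with six grades tested, which is the difference between an estimate too noisy to act on and one that can support a judgment about the school.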

Unless individual students are tested in adjacent grades as they move through school, which requires annual testing, it is impossible to measure gains in student achievement from one year to the next.  This has three consequences.  

First, schools that serve a disproportionate share of disadvantaged children won’t be credited for their success in improving the academic abilities of their students because improvement won’t be visible, only status.  Thus, if children are only tested in 6th grade, the elementary school that moves its students from the 10th percentile in math to the 40th percentile from 3rd to 6th grade will look exactly the same as the school whose students performed at the 50th percentile in grade three and fell to the 40th percentile in grade six. 
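The contrast between status and growth in the example above can be written out as a minimal sketch. The school labels and function names are hypothetical, and a simple difference in percentiles stands in for growth; the growth models states actually use are more involved.

```python
# Illustrative only: school labels are hypothetical.
# The percentiles are the ones used in the example in the text.
schools = {
    "School A": {"grade3": 10, "grade6": 40},
    "School B": {"grade3": 50, "grade6": 40},
}

def status(school):
    """Status: the 6th-grade percentile alone (all that one-grade testing shows)."""
    return school["grade6"]

def growth(school):
    """Growth: change in percentile from 3rd to 6th grade (needs both scores)."""
    return school["grade6"] - school["grade3"]

for name, scores in schools.items():
    print(f"{name}: status={status(scores)}, growth={growth(scores):+d}")
```

With testing only in 6th grade, both schools report the same status of 40. With scores from both grades, the first school shows a gain of 30 percentile points and the second a loss of 10.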

For the same reasons, it will be impossible to differentiate teachers based on their ability to generate gains in student learning in their classrooms.  Such value-added measures require a test-based estimate of the difference between the math or language skills individual students have when they enter a teacher’s classroom and when they leave it.  This requires annual testing.  We have learned over the last decade just how important differences in teacher effectiveness are to student outcomes.  The ability to collect and use this information to support improvements in teacher preparation, professional development, and personnel actions will be lost without annual testing.

Finally, eliminating annual testing would prevent researchers and policymakers from judging the effectiveness of new education programs in which the research design depends on knowledge of students’ recent achievement. By hampering our ability to learn about what’s working and what’s not, jettisoning annual testing would have a negative effect on the rate of improvement in achievement over time. 

Most of the opponents of federally imposed standards, testing, and accountability should be in favor of federally imposed annual testing shorn of standards and accountability

Conservatives, generally, want to rein in federal control of education while driving bottom-up reforms by empowering parents with greater choice of where to send their children to school.  Choice is empty without valid information on school performance (like going online to choose a restaurant for dinner and finding no reviews), and student learning is the most critical school function on which customers need performance data.  Conservatives should favor a federal role in collecting and disseminating this information.  And it doesn’t have to be the same test across the nation to provide this information, or even a single end-of-the-year test as opposed to a series of tests given across the year that can be rolled up into an estimate of annual growth.  All that is required is something that tests what a school intends to teach and is normed to a state or national population.

Progressives have a strong commitment to educational equity and adequacy for historically disadvantaged populations.  They think that funding is critical, but nearly all understand that how the money is spent and to what ends is equally important.  One of the undeniable successes of NCLB was to expose to public scrutiny the failures of many of our public schools to adequately educate disadvantaged subgroups.  If information on student learning from annual testing disappears, so too will the attention to the needs of subgroups that are illuminated through annual testing.  Progressives should support annual testing for reasons of equity.

Concerned parents are reacting to test prep regimes for annual tests, not the tests themselves (which take no more than a day of school time to administer).  If the federal targets for test scores and associated sanctions are jettisoned, so too should much of the test prep regime.  Test scores become, then, one among several forms of information on school performance that parents should value and consume.  Parents who are concerned about their children’s schooling should want to know how their school of choice is performing on state tests, as well as the satisfaction of parents and students who are served by the school, the experience and effectiveness of its teachers, the extent to which the school prepares its students for the next step in their education journey, the school’s extracurricular activities and degree of student engagement, and other factors that people care about and that can be made available for public scrutiny.  Surely, such parents no more want to be in the dark about a K-12 school’s academic performance than they would want to ignore the quality of the college to which their child will eventually seek enrollment.

Teacher unions may be a lost cause on annual testing because of the harsh stance they have already taken and their awareness that information on individual differences in teacher effectiveness is a powerful lever that doesn’t require a federal accountability mandate to be put to use by reform-oriented school districts.  But even they may see value in a horse trade in which Congress eliminates federal requirements for states to evaluate teachers based on test scores but retains annual testing.  

The performance of the nation’s education system is critical to our future and to the lives of the students who experience it.  The fundamental responsibility of schools is student learning.  Valid estimates of student learning that strongly predict later life outcomes can be derived from annual academic tests.  Much depends on the continued collection and dissemination of such information.  Only the federal government is in a position to see that it happens.  Congress can reauthorize ESEA, retain the requirement for annual tests that yield measures of student growth, and satisfy a diverse set of political factions if it focuses on its responsibility to see that valid information on school performance is available for all to use while pulling back from previous efforts to insert the U.S. Department of Education into roles that were previously reserved to states and school districts.