The Resurgence of Ability Grouping and Persistence of Tracking

Part II of the 2013 Brown Center Report on American Education

This study examines the use of ability grouping and tracking in America’s schools. Recent NAEP data reveal a resurgence of ability grouping in fourth grade and the persistent popularity of tracking in eighth-grade mathematics. These trends are surprising considering the vehement opposition of powerful organizations to both practices. The current study does not delve into the debate; it is interested in what schools are doing, not why or whether they should do it. Discussion is nevertheless offered at the end of the article on the implications of the findings for the controversy surrounding the topic.

Ability grouping and tracking are often confused. They both attempt to match students with curriculum based on students’ ability or prior performance, but the two practices differ in several respects. Tracking takes place between classes, ability grouping within classes. Tracking primarily occurs in high school and sometimes in middle school. In tracked academic subjects, students are assigned to different classrooms, receive instruction from different teachers, and study a different curriculum. The names of high school courses signal curricular differences. Advanced math students in tenth grade, for example, may take Algebra II while others take Geometry, Algebra I, or Pre-Algebra. Advanced tenth graders in English language arts (ELA) may attend a class called “Honors English” while other students attend “English 10” or “Reading 10.” Excellent science students may take “AP Chemistry” while others take a course simply called “Chemistry” or “General Science.” History may also be tracked, as when Advanced Placement courses are offered in U.S. or European history that not all students take. Some middle and high schools do not track at all, creating instead classes that are heterogeneous in ability. Students of all abilities study the same material.

What Tracking is Not

Perhaps the best way to clarify what tracking is, because of widespread misconceptions, is by describing what it is not. Tracking is decided subject by subject. Students are not assigned to college preparatory or vocational tracks that then dictate coursework all through high school; that practice died out in the U.S. in the late 1960s and early 1970s.11,12 European and Asian school systems still practice a form of this type of tracking (they call it “streaming”), typically in the final two or three years of secondary schooling.13 Students take placement exams and, based on their scores, are selected into separate schools with markedly different post-secondary destinations rather than attending different classes at the same school.14 Exam-based selection into high schools was common in the U.S. in the 19th century and the early part of the 20th century, but fell by the wayside. The comprehensive high school, with all students of a particular community attending the same school and then divided into distinct tracks within the school, came to be enshrined as the American model.

Ability Grouping

Ability grouping typically is an elementary school practice. Most elementary classes feature a single teacher with a classroom of students who are heterogeneous in ability. To create more homogeneity, teachers may divide students into small instructional groups reflecting different levels of ability, most often for reading in the primary grades (K–3) and perhaps for reading or math in later grades (4–6).15 While the teacher provides instruction to one group, the other students work independently—engaged in cooperative group activities or computer instruction or completing worksheets to reinforce skills. The teacher rotates among the groups so that each student receives a dose of teacher-led instruction in these small settings.

Researchers from Johns Hopkins conducted a comprehensive survey of ability grouping and tracking in 1986. The study analyzed national data augmented by an in-depth survey of Pennsylvania schools. Several interesting patterns were uncovered that still hold true today. Disaggregating the data by grade level revealed that ability grouping is most prominent in first grade and then slowly recedes over subsequent grades. Ability grouping and tracking are inversely related; the school system’s strategies for creating groups that are as homogeneous as possible shift over the K-12 grade span. Tracking is rare in the elementary grades and, after increasing dramatically in middle school (in mathematics, in particular), peaks towards the end of high school. It is rare for students, once grouped between classes by tracking, to be grouped again within classes by ability grouping.16

Because the groupings are within-class (and often decided by a single teacher), ability grouping is more flexible than tracking. Groups may be reshuffled periodically to reflect changes in student performance. Ability groups might study from different levels of the same textbook series or use the same book and move at a different pace (with enrichment activities for the faster groups until the others catch up). Instead of the formality of transcript designations for high school courses, ability groups often take the names of animals—redbirds, bluebirds, sharks, dolphins, and the like—or the names of the books in the reading series that the students are using.

The most popular alternatives to ability-grouped instruction are whole class instruction, in which all students in the same classroom receive the same instruction, and the creation of small heterogeneous groups. Sometimes cooperative learning strategies are employed with heterogeneous groups, but cooperative learning can be used with any small group regardless of the criterion by which it is formed. Success for All, for example, is a popular program combining cooperative learning with small ability groups that are frequently reorganized to reflect student progress.17


In the 1970s and 1980s, a barrage of studies criticized tracking and ability grouping. Race and class figured prominently in the debate. Grouping students by ability, no matter how it is done, will inevitably separate students by characteristics that are correlated statistically with measures of ability, including race, ethnicity, native language, and class. Critics argued that tracking and ability grouping do not separate students into socioeconomic status-related groups by accident. Ray C. Rist’s “Self-Fulfilling Prophecy in Ghetto Education” (1970) followed a group of kindergarten students through the first few years of school and noted how the composition of reading groups rarely changed, consistently reflecting students’ socioeconomic status (SES).18 The SES differences are hardened, Rist argued, as teachers develop different expectations for groups of low- and high-performing students, even if those groups are given innocuous-sounding names to mask their status.19 James Rosenbaum’s Making Inequality (1976) described working class youth at a New England high school who were channeled into vocational and remedial tracks that were nothing more than boring, academic dead ends.20

In 1985, Jeannie Oakes’s classic book, Keeping Track, was published. Oakes drew on data from several junior and senior high schools. Building on the social reproductionist theories of Samuel Bowles and Herbert Gintis’s Schooling in Capitalist America, Oakes argued that although tracking is typically justified by educators as a strategic response to student heterogeneity, the practice is undergirded by normative beliefs regarding race and class and is politically defended by white, middle-class parents to protect privilege. Black, Hispanic and poor children populate remedial classes; middle-class white children populate honors courses. Tracking and ability grouping are not mere bystanders to social injustice, Oakes and other critics charged. Such practices don’t just mirror the inequalities of the broader society. They reproduce and perpetuate inequality.21

This critique had a profound effect on policy and practice. In the 1990s, several prominent political organizations passed resolutions condemning tracking, including the National Governors Association, the American Civil Liberties Union, the Children’s Defense Fund, and the NAACP Legal Defense Fund. Some states urged schools to reduce tracking and ability grouping, most notably California and Massachusetts. A surprising implementation story ensued. Although the call to detrack was not accompanied by conventional incentives—the big budgets, regulatory regimes, and rewards and sanctions that draw the attention of policy analysts—detracking was, in a field famous for ignored or subverted policies, adopted by a large number of schools.22

Surveys of Ability Grouping

How much did ability grouping decline? A 1961 national survey revealed that about 80% of elementary schools grouped students by ability for reading instruction.23 A three-group format was the dominant approach, with students organized into high, middle, and low performing groups. Although subsequent national surveys of ability grouping are scarce until the Johns Hopkins study in the mid-1980s (mentioned above), carefully crafted studies of local practice reported similar frequencies. Eighty percent or more of elementary schools used within-class ability groups.24

Then things changed. A mid-1990s survey of a random sample of pre-K through fifth grade teachers reported startlingly different results. When allowed multiple responses, only 27% of teachers reported using ability grouping for reading instruction. Another 56% of teachers indicated that they used flexible grouping. Some of the teachers with flexible grouping may have utilized ability as a criterion for grouping.25 Whole class instruction was by far the most popular organizing strategy, with 68% of teachers reporting its use. Removing the overlapping responses makes it clear that ability grouping served a subordinate role as a method of organizing students. When teachers were held to one response and asked to identify their primary organizational approach, the order was: whole-class instruction (52%), flexible grouping (25%), and ability grouping (16%).

A more recent survey suggests ability grouping has regained favor among teachers. Barbara Fink Chorzempa and Steve Graham (2006) surveyed a national random sample of first through third grade teachers. Their questionnaire asked questions similar to the Baumann et al. survey of the 1990s, but also included questions about why teachers ability group. Three times as many teachers (63%) said they use ability grouping as in the earlier survey. The authors explain that the discrepant findings may stem from the different grade levels of teachers in the two surveys. Pre-K and fourth- and fifth-grade teachers, who are included in the earlier survey but not in the later one, may be less likely to employ ability grouping than first through third-grade teachers, the target population of the later survey. Interestingly, the top reason teachers gave for using ability grouping was “that it helps them meet students’ needs;” however, respondents also expressed concern about the quality of instruction in low ability groups.26 About 20% of teachers did not ability group at all because the practice was banned by district or school policy.

Is ability grouping in decline or on the rise again? How about tracking? Let’s turn to NAEP data to shed light on these questions.

NAEP Data on Ability Grouping

Table 2-1 displays NAEP data on ability grouping in fourth grade reading. Teachers were asked on what basis they create instructional groups (ability, interest, diversity, and other) with “not created” also an option. Bear in mind that asking fourth-grade teachers about ability grouping, as compared to sampling teachers of several elementary grades, has both an upside and a downside in elucidating trends. The upside is that grade level is held constant over several surveys. This is important because we know ability grouping varies by grade level. The downside is that fourth grade isn’t where the action is on ability grouping—that’s first grade, where unfortunately NAEP does not collect data. Fourth grade is well after ability grouping’s apogee and somewhere near the midpoint of its diminishing use by elementary teachers.


Table 2-1 is revealing. The percentage of students placed into ability groups for reading instruction skyrocketed from 1998 to 2009, from 28% to 71%. And the percentage of students whose teachers did not create ability groups fell from 39% in 1998 to 8% in 2009. In other words, the odds of a fourth grader being ability grouped in reading were less than 50-50 in 1998 but by 2009 had increased to about 9 to 1. The question was not asked prior to 1998.
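The odds quoted above can be checked with a few lines of arithmetic. The sketch below is one possible reading of the comparison, not the report's stated method: it assumes the odds set the share of students who were ability grouped against the share whose teachers created no groups, using only the 1998 and 2009 percentages cited in the text. The names `READING` and `odds_grouped` are illustrative.

```python
# Percentages quoted in the text for fourth-grade reading (Table 2-1).
# Pairing "ability grouped" against "not created" is an assumption about
# how the odds were derived; the report does not spell out the method.
READING = {
    1998: {"ability_grouped": 28, "not_created": 39},
    2009: {"ability_grouped": 71, "not_created": 8},
}


def odds_grouped(shares):
    """Return the odds (x to 1) of ability grouping versus no groups created."""
    return shares["ability_grouped"] / shares["not_created"]


odds = {year: odds_grouped(shares) for year, shares in READING.items()}

for year, value in sorted(odds.items()):
    print(f"{year}: about {value:.1f} to 1")
```

Under this reading, the 1998 figure falls below 1 to 1 (worse than even odds) and the 2009 figure rounds to 9 to 1, in line with the text.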

Table 2-2 shows the frequency of ability grouping in fourth-grade mathematics. Teachers were asked if they create math groups based on ability. This question was asked twice before 1998 and in 2011, so it gives a deeper historical perspective than the question on reading. Math ability grouping dips from 1992 to 1996 (48% to 40%), stays about the same until 2003 (42%), and then accelerates from 2003 to 2011 (reaching 61% in 2011).


The NAEP data support the general finding of a drop in ability grouping in the 1990s and a resurgence in the 2000s. The rebound is more subdued in math than in reading. It is apparent by 2000 in reading (it may have begun even before then; the data start in 1998) but does not begin in math until after 2003. In the years for which data are available for both reading and math (2000, 2003, 2007, 2009), the two subjects have comparable frequencies in 2000 (39% in reading and 41% in math), but reading is more often grouped in subsequent years. In the last year with data on both subjects, 2009, 71% of fourth grade students were ability grouped for reading and 54% for math.

NAEP Data on Tracking

Table 2-3 displays NAEP data on tracking in 8th grade. Note that unlike ability grouping, which is a classroom level practice and consequently a topic for teacher surveys, tracking is a school level practice and a topic for surveys of school principals. Although the wording of the survey item varies slightly from year to year, NAEP asks principals whether students are assigned to classes based on ability so as to create some classes that are higher in average ability or achievement than others. The question is asked sporadically and about different subjects in different years.


Math has the most data, surveyed ten times from 1990–2011. Tracking in math shows a slight dip in the 1990s and an increase in the 2000s, but most of the fluctuations are too small to consider significant. The trend is essentially flat, with about three-fourths of students attending tracked math classes over the past two decades. Typically, this means schools offer an algebra class for some eighth graders and a pre-algebra class for those who are not yet ready for formal algebra (see table 3-2 for enrollment statistics). Sometimes a third class is offered, perhaps geometry for students who took algebra in seventh grade or a basic math class for students several years behind.

Data on the other subjects are spotty. They exhibit much less tracking than math and greater variation over time. In 1990, principals reported that 60% of students were in tracked ELA classes, a statistic that declined over the next several years, hitting a low of 32% in 1998. The 43% frequency of tracking reported in 2003 is an increase from 1998; however, because it was the last time the question was asked in that subject, it is impossible to tell whether an enduring rebound in ELA tracking had begun. Science and history have even less data, with both subjects registering their highest figures in 1990 and then indicating diminished tracking after that. Science seems to show a rebound from 1994–2000. For all four subjects, the least amount of tracking occurred between 1994 and 1998, when the detracking movement was in full bloom.

The national pattern is consistent with previous studies of California and Massachusetts. In those two states, detracking was most intense in the early to mid-1990s, but differences among the subjects emerged. Mathematics resisted detracking while heterogeneously grouped classes became the norm in ELA, science, and history. In a 2009 survey of Massachusetts schools with eighth grades, for example, in math only 15.6% of schools offered heterogeneously-grouped classes; 49.2% offered classes with two ability levels; and 35.2% offered three levels. In other subjects, tracking had almost disappeared—72.7% offered only heterogeneously-grouped classes in ELA, 89.8% in history, and 86.7% in science.27


This study has explored trends in the use of ability grouping and tracking by American schools. It used NAEP data to examine the frequency that fourth graders are assigned to groups and eighth graders assigned to classes based on ability or prior achievement. The investigation focused on what schools are doing, not on whether tracking or ability grouping is a good idea.

NAEP data from 1990 to 2011 were examined. Ability grouping in fourth grade decreased in the 1990s and then increased markedly in the 2000s, with the rebound apparent in both reading and math. In reading, ability grouping has attained a popularity unseen since the 1980s, used with over 70% of students. As for tracking, it has remained commonplace in eighth-grade mathematics for the past two decades, with about three-quarters of students enrolled in distinct ability-level math classes. Tracking in ELA declined sharply from 1990 to 1998, and although there was a rebound in 2003, NAEP has not surveyed schools on tracking in ELA since then. And NAEP data are too sparse in other subjects to determine trends.

Do these trends matter? Why should anyone care about tracking and ability grouping? Although the debate today is more subdued than in the 1980s and 1990s, it does continue. A research review on the NEA website blasts both tracking and ability grouping as discriminatory.28 Scholars continue to wrangle over the wisdom of both practices. Effectiveness and equity persist as the dominant themes of this literature. A 2010 meta-analysis of high-quality studies calculated a positive effect size of 0.22, equal to about one-half year of learning, for within-class grouping in reading instruction.29 A 2009 study of data from the Early Childhood Longitudinal Study (ECLS), on the other hand, found “students who are lower grouped for reading instruction learn substantially less, and higher-grouped students learn slightly more over the first few years of school, compared to students who are in classrooms that do not practice grouping.”30 That finding is especially relevant to closing achievement gaps between students who may populate high and low groups.

The controversy offers a very important lesson about how education policy gets implemented in schools. Schools are not merely the last step of a vast organizational ladder, not simply the education system’s operational frontline, ready to put in place the policies that are passed down from above. Finley Peter Dunne famously observed that the U.S. Supreme Court “follows the election returns.” Court decisions not only reflect the U.S. Constitution but public opinion as well. Our schools are another institution with an ear to the ground. Educators are aware of public debates and are influenced when particular school practices become controversial.

Figure 2-1 shows the number of times the term “ability grouping” appeared in Education Week from 1983 to December 2012. Consider this a proxy for media visibility over the past thirty years. The 135 appearances over these three decades represent an average of 4.5 mentions per year. The peak coverage occurred in 1993, with 20 mentions. The years immediately preceding 1993 show a gradual buildup in coverage, with 5 mentions in 1989, 13 in 1990, 11 in 1991, and 13 in 1992. The years immediately after 1993 show a gradual decline: 8 appearances in 1994, 5 in 1995, 7 in 1996, 5 in 1997, and 7 in 1998. The ten years from 1989 to 1998 are the only years with 5 or more annual mentions. Tracking and ability grouping were in the spotlight.
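The figures in this paragraph can be reproduced directly. The sketch below uses only the counts quoted in the text; the annual counts for years outside 1989 to 1998 are not itemized in the article, so their average is derived by subtraction rather than read from data. All variable names are illustrative.

```python
# Annual mentions of "ability grouping" in Education Week, as quoted in the
# text. Only 1989-1998 are itemized; counts for other years are not given.
QUOTED_MENTIONS = {
    1989: 5, 1990: 13, 1991: 11, 1992: 13, 1993: 20,
    1994: 8, 1995: 5, 1996: 7, 1997: 5, 1998: 7,
}
TOTAL_MENTIONS = 135                    # total for 1983-2012, from the text
YEARS_COVERED = len(range(1983, 2013))  # 30 calendar years, inclusive

average_per_year = TOTAL_MENTIONS / YEARS_COVERED
peak_year = max(QUOTED_MENTIONS, key=QUOTED_MENTIONS.get)
decade_total = sum(QUOTED_MENTIONS.values())
decade_share = decade_total / TOTAL_MENTIONS

# Average for the 20 years outside 1989-1998, derived by subtraction.
other_years = YEARS_COVERED - len(QUOTED_MENTIONS)
other_years_average = (TOTAL_MENTIONS - decade_total) / other_years

print(f"average per year: {average_per_year}")
print(f"peak year: {peak_year}")
print(f"1989-1998 share of all mentions: {decade_share:.0%}")
print(f"average outside 1989-1998: {other_years_average:.2f}")
```

The ten itemized years account for 94 of the 135 mentions, roughly 70%, which leaves an average of about two mentions a year for the other twenty years.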


The data on media visibility are inversely related to the data on use. At the beginning of the 1990s, tracking and ability grouping were conventional practices but then declined, albeit with some lag time, when they were subjected to the most public scrutiny. The mentions in Education Week peaked in 1993. The use of ability grouping and tracking reached all-time lows soon after that event. As the controversy died down in the 2000s, schools returned to both practices.

What else may have promoted the resurgence in the 2000s? Accountability systems, bolstered by the accountability provisions of No Child Left Behind, focus educators’ attention on students below the threshold for “proficiency” on state tests. That provides a statutory justification for grouping students who are struggling. The increased use of computer instruction in elementary classrooms cannot help but make teachers more comfortable with students in the same classroom studying different materials and progressing at different rates through curriculum. The term “differentiated instruction,” while ambiguous in practice, might make grouping students by prior achievement or skill level an acceptable strategy for educators who recoil from the term “ability grouping.”

A substantial number of teachers believe that heterogeneous classes are difficult to teach. The 2008 MetLife Survey of the American Teacher asked teachers to react to the following statement: “My class/classes in my school have become so mixed in terms of students’ learning ability that I/teachers can’t teach them.” Responses were: 14% “agree strongly,” 29% “agree somewhat,” 28% “disagree somewhat,” and 27% “disagree strongly.”31 The percentages are surprising given the questionnaire’s blunt assertion that heterogeneous classes are impossible to teach. Moreover, the 43% of respondents who agree either strongly or somewhat with the prompt is up from 39% on the same survey item in 1988. Teachers’ beliefs about the impact of achievement heterogeneity on instruction undergird the use of ability grouping and tracking.

Let’s look ahead. Will the uptrend in ability grouping continue? Not necessarily. The current period may be the lull before the storm. Theoretically, at least, the Common Core establishes a curriculum that most, if not all, students will study. It is unclear how students who have already mastered the Common Core standards before beginning a particular school grade will have their needs met under the new regime. The same goes for students who lag many years behind. Tracking and ability grouping have been common approaches to addressing such challenges. These two organizational strategies affect millions of students daily. Both practices shape aspects of schooling that we know to be important—the curriculum students study, the textbooks they learn from, the teachers who teach them, the peers with whom they interact. Despite decades of vehement criticism and mountains of documents urging schools to abandon their use, tracking and ability grouping persist—and for the past decade or so, have thrived.


Part II Notes

11. Tom Loveless, The Tracking and Ability Grouping Debate (Washington, DC: Thomas B. Fordham Institute, July 1, 1998). 

12. Samuel R. Lucas, Tracking Inequality: Stratification and Mobility in American High Schools (New York: Teachers College Press, 1999).

13. Even Finland and Sweden, famous for egalitarian reforms, divide students for the final two years of secondary school. Germany begins tracking at age 11.

14. Alan Smithers and Pamela Robinson, Choice and Selection in School Admissions: The Experience of Other Countries, accessed March 4, 2013,

15. Robert Dreeben and Rebecca Barr, “The Formation and Instruction of Ability Groups,” American Journal of Education 97, no. 1 (1988): 34-64.

16. See p. 36, Figure 5: James M. McPartland, J. Robert Coldiron, and Jomills H. Braddock II, School Structures and Classroom Practices in Elementary, Middle, and Secondary Schools, Report No. 14 (Baltimore: The Johns Hopkins University, 1987).

17. “Success for All—Home”, Success for All Foundation,

18. Ray C. Rist, “Student Social Class and Teacher Expectations: The Self-fulfilling Prophecy in Ghetto Education,” Harvard Educational Review 40, no. 3 (1970): 411-451. 

19. Ability grouping is called “setting” in Great Britain. Recent reports have been sharply critical of the practice, see: “Setting Harms Education of Some Young Children, Report Warns,” The Independent, May 16, 2008,

20. James E. Rosenbaum, Making Inequality: The Hidden Curriculum of High School Tracking (New York: John Wiley & Sons, 1976).

21. See: Jeannie Oakes, Keeping Track: How Schools Structure Inequality (New Haven: Yale University Press, 1985). Also see: Jeannie Oakes, Amy Stuart Wells, and Associates, Beyond the Technicalities of School Reform: Policy Lessons from Detracking Schools (Los Angeles: UCLA Graduate School of Education & Information Studies, 1996).

22. The politics and policies of tracking reform are investigated in: Tom Loveless, The Tracking Wars: State Reform Meets School Policy (Washington: Brookings Institution Press, 1999).

23. Mary C. Austin and Coleman Morrison, The Torch Lighters: Tomorrow’s Teachers of Reading (Cambridge: Harvard University Graduate School of Education, 1961).

24. Rebecca Barr and Robert Dreeben, How Schools Work (Chicago: University of Chicago Press, 1983).

25. ECLS asked kindergarten teachers in 1999 how frequently they used ability groups in reading. There were five response categories, ranging from 0 (never) to 4 (daily). 30% reported never using ability grouping. The average for all teachers was 1.64, indicating about once a week (1 = less than once a week; 2 = once or twice weekly). When the ECLS sample was in 3rd grade, 2001–2002, 50% of teachers employed ability grouping in reading, consistent with the NAEP figure for 4th grade in 2003 (47%). See p. 301, note 6 in Christy Lleras and Claudia Rangel, “Ability Grouping Practices in Elementary School and African American/Hispanic Achievement,” American Journal of Education 115, no. 2 (2009): 279–304.

26. Barbara Fink Chorzempa and Steve Graham, “Primary-Grade Teachers’ Use of Within-Class Ability Grouping in Reading,” Journal of Educational Psychology 98, no. 3 (2006): 529-541.

27. Tom Loveless, Tracking and Detracking: High Achievers in Massachusetts Middle Schools (Washington, DC: Thomas B. Fordham Institute, 2009).

28. “Research Spotlight on Academic Ability Grouping,” NEA,

29. Kelly Puzio and Glenn Colby, The Effects of Within Class Grouping on Reading Achievement: A Meta-Analytic Synthesis (Evanston: Society for Research on Educational Effectiveness, 2010).

30. Christy Lleras and Claudia Rangel, “Ability Grouping Practices in Elementary School and African American/Hispanic Achievement,” American Journal of Education 115, no. 2 (2009): 279.

31. Dana Markow and Michelle Cooper, The MetLife Survey of the American Teacher: Past, Present and Future (New York: MetLife, 2008).