There are many Americans who would benefit from a postsecondary education but who never attend college, or who start college but don’t earn a degree. Many come from low-income families.1 Addressing gaps in educational attainment by family income, which exist even among similarly prepared students, is one of the most significant challenges facing policymakers concerned about income inequality and socioeconomic mobility.
It’s important that we closely track progress – or regress – in closing these gaps. Are we closing the gap in educational attainment between those who grew up in rich and poor families? How much progress did we make this year? Last year? Over the last five years? The currently available data are simply not up to answering these questions. It should be a priority for statistical agencies, particularly the Department of Education, to focus on this critical gap in our knowledge.
In this essay, we describe the data gaps in tracking income differences in educational attainment and the pitfalls they create for analysts. We also propose how to fill these gaps in our knowledge.
First, what we do well. We can track educational attainment of the population on an annual basis: the Current Population Survey shows, for example, that the bachelor’s degree attainment rate among 24-year-olds rose to 30 percent in 2014, its highest level ever, although it had hovered between 27 and 30 percent in every year since 2008.2
Now, what we do poorly. We can’t say how these statistics, and particularly their annual values, differ by parental income. To calculate such statistics, we need data that link young adults (and their educational attainment) to their parents. The federal government releases such data only every ten years – far too infrequently to guide policy and gauge progress.3 The most recent federal data that let us calculate income gaps in educational attainment are for the high school class of 2004, most of whom turned 26 in 2012. At that point, the gap between the top and bottom income quartiles in college graduation was 37 percentage points.4 We are now three years behind in knowing whether we are making progress (or regress) in closing this gap, and it will be several more years before updated data are available.
Attempts to produce annual estimates with the wrong data can produce wildly misleading results. A recent publication from the Pell Institute for the Study of Opportunity in Higher Education reported that 99 percent of college students who grew up in the top income quartile went on to complete their BA, compared to 21 percent in the poorest quartile.5 These findings received prominent attention from the news media and among policy analysts. Libby Nelson of Vox declared herself “gobsmacked” by the attainment data, and Richard Reeves of Brookings had a similar reaction: “Wow. I mean, WOW.”
To put it bluntly, these statistics are wrong. They are based on data from the monthly Current Population Survey (CPS). It would be wonderful if the CPS provided the income of young adults’ parents, but as a rule it does not. Parental information is present in the CPS only if the young adult is currently living at home, or is temporarily away from home. The primary reason a young adult is temporarily away from home is to attend college. Once a child forms her own household, she disappears from her parents’ CPS record.
If we limit our analysis of the educational attainment of young adults to those who are linked to their parents’ CPS records, we have committed the mortal sin of selection bias: selecting the sample on the variable of interest.6
The figure below shows that, in 2013, we have data on parental income for only 35 percent of 24-year-olds, the age group typically examined to track trends in bachelor’s degree attainment. When this same cohort was 16, in 2005, the vast majority (92 percent) were linked to their parents’ records. As the cohort aged, the share missing parental information steadily increased, becoming a majority by age 22.
Source: Authors’ calculations using October CPS data, 2005 (age 16) through 2013 (age 24)
Notes: Living in parents’ household defined as respondents of listed age identified as living in parent’s, grandparent’s, or foster parent’s household.
This shrinking sample would not be a problem if children randomly exited their parents’ households. If that were the case, the randomly selected 35 percent of 24-year-olds for whom we have parents’ income could still inform us about gaps in BA attainment. But, as noted above, college enrollment determines whether a child is attached to her parents’ household record in the CPS.
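A minimal sketch in Python may help show why non-random exit matters. It is an illustration only: the linkage probabilities are hypothetical values chosen for the example, not CPS estimates, and the function name `observed_ba_rate` is an assumption of this sketch. The 54 percent attainment rate is the NLSY figure for the top income quartile discussed later in this essay.

```python
def observed_ba_rate(true_rate, p_linked_if_ba, p_linked_if_no_ba):
    """BA rate measured only among young adults still linked to parents.

    true_rate: share of the full cohort holding a BA.
    p_linked_if_ba / p_linked_if_no_ba: hypothetical probabilities that a
    young adult with / without a BA still appears on the parents' record.
    """
    linked_with_ba = true_rate * p_linked_if_ba
    linked_without_ba = (1 - true_rate) * p_linked_if_no_ba
    return linked_with_ba / (linked_with_ba + linked_without_ba)


# Random exit: both groups are equally likely to remain linked,
# so the linked subsample recovers the true rate.
random_exit = observed_ba_rate(0.54, 0.35, 0.35)

# Non-random exit (hypothetical probabilities): linkage depends on the
# outcome, so the linked subsample misstates the true rate.
selective_exit = observed_ba_rate(0.54, 0.50, 0.20)

print(f"random exit:    {random_exit:.2f}")
print(f"selective exit: {selective_exit:.2f}")
```

With equal linkage probabilities the linked subsample returns the true 54 percent. With the made-up unequal probabilities it returns roughly 75 percent, even though nothing about the underlying cohort has changed.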
Since there are income gaps in college enrollment, there are also income gaps in the share of adult children for whom we have parental income in the CPS. The figure below shows that high-income families are overrepresented in the CPS data as their children age.7 Fifty percent of 24-year-olds from high-income families are attached to their parents’ household, compared to only 28-37 percent of children from all other families.
Source: Authors’ calculations using March CPS data, 2005 (age 16) through 2013 (age 24)
Notes: Percent remaining in parents’ household defined as number of respondents of listed age identified as living in parent’s, grandparent’s, or foster parent’s household divided by number of respondents identified as such at age 16. Income quartiles are based on households with children between the ages of 14 and 16 in each year.
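The share-remaining calculation described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the code behind the figure: the weighted counts are invented placeholders, and `survey_year` and `share_remaining` are hypothetical helper names.

```python
BASE_AGE = 16
BASE_YEAR = 2005


def survey_year(age):
    """Survey year in which the cohort aged 16 in 2005 reaches `age`."""
    return BASE_YEAR + (age - BASE_AGE)


def share_remaining(linked_counts):
    """Percent still linked to a parental household, relative to age 16.

    linked_counts maps age -> weighted number of respondents of that age
    recorded in a parent's, grandparent's, or foster parent's household
    in the matching survey year.
    """
    base = linked_counts[BASE_AGE]
    return {age: 100 * n / base for age, n in sorted(linked_counts.items())}


# Invented placeholder counts (in thousands), NOT CPS estimates.
counts = {16: 1000, 18: 900, 20: 700, 22: 500, 24: 400}

for age, pct in share_remaining(counts).items():
    print(survey_year(age), age, f"{pct:.0f}%")
```

Because different respondents are surveyed each year, the ratio compares repeated cross-sections of one birth cohort rather than following the same individuals.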
Tom Mortenson was kind enough to share with us the methodology he used in generating the statistics for the Pell Institute report. He explained that the CPS data in the report were statistically adjusted to account for the exit of adult children from their parents’ households. The statistical adjustments were made using data from a federal study of the high school class of 1982, High School and Beyond (HSB). The HSB data link parents and their adult children. For the relevant cohort, which was 24 in 1988, this statistical adjustment (by construction) produces fairly accurate CPS estimates of income differences in college attendance and completion.
But a lot has changed since 1988 and, as a result, this decades-old adjustment no longer produces correct CPS estimates. In fact, in some years, this adjustment has indicated that more than 110 percent of some groups had completed college – a red flag that this approach has gone off the rails.
We can see that the CPS estimates are incorrect by comparing them to those produced by the (infrequent) surveys that consistently link adults to data on their parents’ income. One is the National Longitudinal Survey of Youth (NLSY). In the NLSY, among those who turned 24 between 2005 and 2008, 54 percent of the top income quartile completed a BA. The estimate is similar (54-59 percent) in the Panel Study of Income Dynamics8 and in Department of Education data recently discussed by Sandy Baum.
These three longitudinal surveys provide a consistent picture for cohorts turning 24 between 2005 and 2008: 54-59 percent of those raised in the top income quartile earned a BA. The Pell Institute estimate for this group is 75-80 percent, which is off by a full twenty percentage points.9
The report’s estimates of the graduation rate of college entrants from the top income quartile are also wrong. From the NLSY, we know that for recent cohorts about 65 percent of college entrants raised in the top income quartile earned a BA. The equivalent Pell Institute estimate is 99 percent – the eye-catching but incorrect statistic that hit the headlines.
We are not cherry-picking these numbers. If we look to other cohorts, we see similar, systematic errors in the Pell Institute’s CPS estimates.
We share the Pell Institute’s deep concern with growing income gaps in children’s educational attainment. That’s why we think it’s extremely important to make sure we get the data on these gaps correct, so we can know if they are getting worse or better and then act on our knowledge.
Margaret Cahalan, director of the Pell Institute, indicates they will be reassessing the data used for future reports: “Our plans for the 2016 Indicators Report will include looking at trends in so far as possible from the NCES high school longitudinal studies as well as more explanation as to the limitations of the CPS data on educational attainment.”
We also shared our analysis and policy conclusions with Tom Mortenson of the Pell Institute, who had this to say: “We need better data! I support a national unit record data system, repeal of the student privacy legislation for research purposes, and an annual or perhaps biennial federal longitudinal study that follows up for ten years after high school graduation.”
We agree. At present, there is no data source that can be used to credibly and consistently measure these gaps on an annual basis. The existing longitudinal databases are administered too infrequently (NLSY and various Department of Education datasets) or have samples too small to reliably measure annual values (PSID).
The federal government could solve this problem at low cost by supplementing surveys with administrative data. The CPS records of young adults could be linked to administrative data on their parents’ income held by the Internal Revenue Service or Social Security Administration. Or the IRS could release a data series on college attendance by family income, since it has collected data on college attendance (though not graduation) since the late 1990s. This information could be calculated for the nation as a whole as well as for individual states. Finally, NCES could conduct its longitudinal studies more often but with less voluminous surveys.
We end with a caution for consumers of data and statistics. If a statistic seems wildly wrong, it probably is. Most of us probably know enough college dropouts from the top income quartile to know that a 99 percent college completion rate can’t possibly be right.
2 Authors’ calculations using March CPS. We thank Katharine Lindquist for exceptional research assistance with the analysis of the CPS data.
3 The Panel Study of Income Dynamics is administered yearly, but the samples for the relevant age groups are too small to produce precise estimates of postsecondary outcomes by family income on an annual basis.
4 Authors’ calculations from the ELS:2002 indicate that 17 percent of students from the bottom income quartile attained a bachelor’s degree or higher by 2012, as compared to 54 percent of students from the top income quartile. Because the ELS survey question records income in bins, the quartiles cannot be made exactly equal in size: we classify 21 percent of students in the bottom quartile and 26 percent in the top quartile.
5 From the report: “In 2013 the top quartile approached universal completion of a bachelor’s degree among those who entered college,” referring to figure on p. 33, which plots this series over time.
7 As noted above, CPS respondents cannot be tracked over time. Instead, we track cohorts from one year to the next (and fix the income quartile cutoffs using the distribution of households with children age 14-16), ignoring any impacts of immigration and emigration. This analysis uses the March CPS data because income is measured continuously, whereas the income question in the October CPS is categorical and cannot be divided into equal-sized quartiles. March is also preferable to October for our purposes because March has a higher response rate to the income question.
8 Duncan, Kalil, and Ziol-Guest (2015).
9 From the report: “In 2013 individuals from the highest-income families were 8 times more likely than individuals from low-income families to obtain a bachelor’s degree by age 24 (77 percent vs. 9 percent),” referring to figure on p. 31.