Around the halls: Brookings scholars discuss a much-needed revision to America’s diversity data

Twenty-seven years after the last revision in 1997, the White House Office of Management and Budget (OMB) has altered its Statistical Policy Directive No. 15 (SPD 15), providing a much-needed update to the standards for defining race and ethnic categories for use in data collection across all government agencies.

The revised standards are not perfect, and OMB promises to continue monitoring their use. But they are a marked improvement in light of ongoing shifts in U.S. racial and ethnic demographics, because they will allow more flexibility for individuals in identifying how they see themselves.

The new changes will affect the “minimal categories” for data collection—the race and ethnicity data that every federal survey will be required to ask about. In addition to the five “race” categories that have been used for decades (reflecting persons identifying as White, Black or African American, Asian, Native Hawaiian or Pacific Islander, or American Indian or Alaska Native), the new minimal categories will also include “Hispanic or Latino” and “Middle Eastern or North African” (or MENA). These two new categories are not traditional racial categories; therefore, this classification will be noted as one of “race and/or ethnicity.”

The addition of the Hispanic or Latino category results from the elimination of the separate “Hispanic/Non-Hispanic” question, which was previously asked in addition to the race question. Research from the Census Bureau and others showed that when presented with separate ethnicity and race questions, a large share of Hispanic or Latino respondents did not identify with traditional racial categories, and that a “combined” race/ethnic question yielded far more valid results. In this new classification, Hispanic or Latino respondents, like those of other groups, can choose to identify with other racial categories as well, but will not be required to.

The addition of the MENA category for people of Middle Eastern and North African descent stems from years of lobbying from MENA-related communities. Prior to the new standards, MENA was often offered as a subcategory of “White.”

Perhaps most importantly, the new standards dictate that as a default, government agencies will collect data on detailed race and ethnic groups within each of the seven minimal race and ethnic categories. And where possible, agencies will provide “check boxes” for the five largest detailed groups (based on the 2020 census) as well as an open-ended “write in” box to capture other groups. (One exception is for the American Indian or Alaska Native category, for which only a write-in option is required.) Moreover, the instructions on questionnaires that allow both minimal and detailed categories encourage respondents to select “all that apply,” thus facilitating the collection of multiracial and multiethnic categories.

Clearly, the expansion of the data that government agencies collect will provide far greater opportunities to examine racial and ethnic disparities on measures of economic well-being, health, education, and more from a variety of sources. With that said, OMB is aware of the challenges involved in implementing the new standards: it provides guidelines for presenting the data and promises to maintain a standing Interagency Committee to carry out continuing research and review of SPD 15. Now, it is up to policymakers, scholars, and practitioners to communicate their experiences with the new standards to the wider network of stakeholders and agencies in order to take advantage of this long overdue effort to improve the nation’s diversity data.

William H. Frey

In this Around the Halls piece, scholars from across Brookings give their early thoughts on these changes and their potential impacts.

The new data standards will help ensure government works for all Americans

Chiraag Bains 

Americans deserve federal policy that works for all of us. From housing to health care, good jobs to quality schools, transportation to environmental protection, public officials must ensure that everyone has access to opportunity and no one is disadvantaged on account of their background. That idea is at the core of President Joe Biden’s equity agenda, which I had the privilege of driving forward as deputy director of the White House Domestic Policy Council.

Serving all populations requires collecting better data. Without accurate data, we can’t assess whether government programs are fair and equally accessible, or if they discriminate against or underserve Americans based on race, ethnicity, or other characteristics.

OMB’s revisions to SPD 15 mark a significant step forward in this effort. The use of a single combined question asking about “race and/or ethnicity” and encouraging people to “select all that apply” will produce more reliable data in our increasingly diverse and multiracial country. The addition of “Middle Eastern or North African” as a minimum required category—previously nested under “White”—better accords with how those groups see themselves. The elimination of “Negro,” “Far East,” and other outdated or offensive terms was long overdue.

Perhaps the most important change is OMB’s requirement that agencies collect detailed, disaggregated data as a default. While broad racial and ethnic categories are valuable, they can mask differences within diverse groups. For example, aggregate data about Asian Americans may obscure the disproportionately high rates of poverty among Hmong Americans or liver cancer for Laotian women. Now, all agencies must include check boxes for the largest subgroups within a category (e.g., African American, Jamaican, Haitian, and others under “Black or African American”) and write-in fields. They must obtain permission from the Office of Information and Regulatory Affairs to seek more limited information.

Critically, the new standards are grounded in evidence and extensive engagement. OMB reviewed 20,000 public comments and held 94 listening sessions, three virtual town halls, and a tribal consultation. It also drew from empirical research conducted by the Census Bureau and others. In 2015, the Bureau tested different question formats and terminology with a representative sample of 1.2 million households and 75,000 follow-up interviews. It found that a combined race/ethnicity question more accurately reflected how Latinos identify in real life: Over 70% preferred to check only “Hispanic or Latino” rather than also select a separate race. Further research will be important to build on OMB’s evidence-based decisions, including on how to encourage people to select multiple categories when appropriate.

OMB’s changes will enable better policymaking. More accurate, granular data can help law enforcement target their resources to combat hate crimes, direct agencies’ outreach toward groups eligible for but unenrolled in public benefits, and illuminate the need to translate government materials into additional languages. The updates to SPD 15 will also facilitate civil rights enforcement, as agencies and private litigants depend on federal datasets to prove and remedy discrimination in redistricting, zoning, lending, employment, and other contexts. In short, the new standards will support more just, inclusive, and responsive government action.

Individuals of MENA descent will finally see themselves reflected in policy

Andre M. Perry 

Most public policy researchers will welcome OMB’s changes in the standards for defining race, mainly because the previous composition didn’t make much sense. For example, people of MENA descent were previously categorized as “White,” yet more often than not, these individuals do not see themselves as such, and white people draw distinctions between themselves and people of MENA descent. Moreover, due to their unique historical and cultural heritage, people who identify as MENA often do not identify as Black either. All of this is complicated by the fact that most enslaved individuals in the Americas came from West Africa, not from the Middle East or Northern Africa.

Expanding the number of racial groups to reflect societal dynamics will allow researchers to better document the impacts of policies on people who “share” a common history and/or lived experiences. MENA groups represent a wide range of religious and ethnic backgrounds, from Persians to Israelis. However, reports of discrimination are shared across many of these groups, particularly among those who identify as Muslim or speak Arabic.

This comes with the understanding that race is fundamentally a social construct, with limited to no functional basis in biology. That is, societies assign value to the racial identities they create or endorse, shaping how policy views and treats groups and individuals. In the case of people who identify as MENA, it’s less clear how policy supposedly privileges and/or prejudices them because we must impute MENA identities from various sources instead of using a direct, self-identified data source.

In order to tease out different impacts of policy, particularly discriminatory effects, researchers need more accurate categories to see how the distribution of resources and effects plays out. Previously, the experiences of Middle Easterners and Northern Africans have been masked by ill-fitting racial categories. The new categories will enable groups that have been hidden by those racial categories to see themselves in policy, strengthening analyses based on race.

American Indians and Alaska Natives remain misunderstood in federal race and ethnicity data, but new data presentation approaches can help

Robert Maxim 

As I wrote in an op-ed for The Hill, while OMB’s new standards for race and ethnicity are a positive development for Hispanic or Latino and MENA individuals, they do not solve many of the most pressing data challenges facing American Indians and Alaska Natives.

The federal government’s concept of “race” is a particularly ill-fitting paradigm for Indigenous people. As a result of centuries of colonization as well as other complex factors around American Indian and Alaska Native identity, these individuals are classified as multiracial at a higher rate than any other OMB-defined racial group. Indeed, across the country, just 23% of American Indians and Alaska Natives identify as that race alone and non-Hispanic.

Because of this, my colleagues and I have previously proposed that the U.S. federal government should follow the example of Canada, Australia, and New Zealand and pose a separate question asking about Indigenous identity or descent, in addition to the existing race and ethnicity questions. Instead, OMB’s new standards will mean that most American Indians and Alaska Natives will continue to be classified under the catch-all “two-or-more races” category.

Despite this, OMB’s new guidance could still help improve data around American Indians and Alaska Natives. Buried in the guidance is a set of new recommendations for the Presentation of Data on Race and Ethnicity by federal agencies. In particular, OMB recommends two approaches that are currently relatively rare for presenting race and ethnicity data. “Approach 1” allows federal agencies to report race and ethnicity data for all individuals identifying as a racial or ethnic group, alone or in combination. “Approach 2” allows federal agencies to report as many detailed race and ethnicity combinations as possible. These new approaches for presenting race and ethnicity data could pull millions of American Indians and Alaska Natives out of the “two-or-more races” classification and ensure they are fully visible in federal data.
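To make the difference concrete, here is a minimal sketch in Python of how the same responses could be tabulated three ways. It is an illustration only, not OMB's or any agency's method: the five sample responses, the function names, and the "Two or more" label are all hypothetical.

```python
from collections import Counter

# Hypothetical responses: each respondent's set of selected categories.
responses = [
    {"American Indian or Alaska Native"},
    {"American Indian or Alaska Native", "White"},
    {"American Indian or Alaska Native", "Hispanic or Latino"},
    {"White"},
    {"Black or African American", "White"},
]


def collapse_multiple(responses):
    """Fold every multi-category response into a single catch-all row."""
    counts = Counter()
    for selected in responses:
        if len(selected) == 1:
            counts[next(iter(selected))] += 1
        else:
            counts["Two or more"] += 1
    return counts


def alone_or_in_combination(responses):
    """Sketch of "Approach 1": count a respondent under every category selected."""
    counts = Counter()
    for selected in responses:
        counts.update(selected)
    return counts


def detailed_combinations(responses):
    """Sketch of "Approach 2": count each distinct combination as its own row."""
    counts = Counter()
    for selected in responses:
        counts[" and ".join(sorted(selected))] += 1
    return counts


for tabulate in (collapse_multiple, alone_or_in_combination, detailed_combinations):
    print(tabulate.__name__, dict(tabulate(responses)))
```

In this toy sample, the catch-all tabulation shows one American Indian or Alaska Native respondent, while the alone-or-in-combination tabulation shows three. Note that alone-or-in-combination totals add up to more than the number of respondents, because a person is counted once under each category selected.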

Federal agencies should embrace these new approaches. While they won’t solve all the data problems that Indian Country is contending with, they nonetheless have the potential both to improve the quality of research around American Indians and Alaska Natives, and more fully illustrate the diversity of Indigenous people in the U.S.

Data is representation, and the OMB guidelines are beginning to put that into practice

Manann Donoghoe 

Data points represent people and communities. And in a world increasingly influenced by data, it’s more important than ever to ensure that the ways we represent diverse groups within public data are effective and equitable. OMB’s changes are moving in the right direction.

In an increasingly diverse America, there are symbolic and psychological gains from enabling more people not to have to tick the “other” box. Not everybody sees themselves represented the way that I do on surveys and censuses. My race and ethnicity category—white, Irish descent—is a baseline category on almost every form I’ve filled out. For me, completing a survey does not provoke reflections on identity or belonging, and that’s a privilege easily taken for granted. For the large number of Americans who don’t identify with the previous racial categories, including many Latinos and individuals from the Middle East and North Africa, these changes will likely be welcomed.

There are also policy gains from more inclusive federal data. When groups are excluded from data, they can be invisibilized—omitted from sampling in survey design, and thus excluded from research processes and policy design. This regularly happens with monolithic categories such as “Latino,” which spans ethnic groups across South and Central America with markedly different histories, experiences, and challenges within the U.S.

In my area of expertise (climate and environmental justice), having an accurate and detailed understanding of race and ethnicity is increasingly crucial for designing policy that can direct resources to underrepresented groups. In the U.S., environmental injustices are highly correlated with race and ethnicity. Having a more comprehensive understanding of demographic data, especially at the local level, can help policymakers identify overburdened populations and ensure that our policies and decisionmaking processes accurately respond to their unique vulnerabilities, concerns, and perspectives. OMB’s changes will help researchers like myself, civic organizations, and policymakers better understand environmental injustices and their solutions for a greater diversity of Americans.

Adding granularity to surveys may introduce data complications

Carol Graham 

While I am very supportive of the effort to increase the representation of different racial and cultural groups in OMB’s surveys, I also want to raise a note of caution about doing so. In my long experience with survey design and research (most recently in the field of well-being economics), I have found that there is always a trade-off between adding complexity and granularity to surveys, on the one hand, and the logistics of administering them and using the results for comparisons across places, cohorts, and time, on the other. In terms of administration (i.e., fielding the surveys), too much length and complexity tend to reduce response rates: the longer and more complex surveys are, the less likely people are to respond. In terms of comparisons, it is key to keep a core group of measures consistent over time, a core that can be supplemented by new measures and metrics but should not be replaced unless absolutely necessary.

The risk is that the new data will not be comparable across cohorts and over time, which would result in the loss of valuable information and lessons. The addition of Hispanic and Latino as an ethnicity in addition to a race, for example, has created confusion and a lack of comparability in many instances (accepting that there are also obvious benefits).

Thus, any discussion of how many and which new indicators to add to the OMB guidelines should take those trade-offs into account. My most recent experience, trying to design and include measures of community well-being in addition to individual well-being in the guidelines for statistical offices around the world, suggests that there is no magic bullet. But there is some consensus on the need to retain a core group of the original measures in surveys, as well as the option to include additional metrics in modules designed to achieve the new objectives. At the least, this is one way to think about proceeding, and it also makes it possible to test the robustness and reliability of the new metrics.