Tools to Avoid Disclosing Information About Individuals in Public Use Microdata Files

Cynthia Taeuber

Executive Summary

Statistical agencies walk a fine line between meeting their legal obligation to avoid disclosing information about individuals and providing useful data for research on complex policy questions.

To meet the needs of data users (researchers and policy makers), statistical agencies have long provided public use microdata sample (PUMS) files that support research while carrying a low risk of re-identifying individuals. Two technological advances of recent years make it easier to assemble a “mosaic” of data sets that increases the chances of identifying an individual, and so make it harder for statistical agencies to meet their statutory mandate to keep private information private:

  • A great deal of information is available about individuals on the Internet.
  • Sophisticated record-linkage software has been developed that can identify a small percentage of the individuals represented in traditional PUMS files.
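
As a rough illustration of the linkage risk described in the second point, the following minimal Python sketch matches a made-up microdata file against a made-up external list. The key variables (age, sex, area), the record values, and the function names are all hypothetical; real linkage software is far more sophisticated, and a record that is unique in a sample is not necessarily unique in the population.

```python
from collections import Counter

# Hypothetical key variables that appear in both files (an assumption for illustration).
QUASI_IDENTIFIERS = ("age", "sex", "area")


def key(record):
    """Return the combination of key-variable values for one record."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def link_unique(microdata, external):
    """Pair microdata records with external names when the key is unique in both files."""
    micro_counts = Counter(key(r) for r in microdata)
    ext_counts = Counter(key(r) for r in external)
    ext_by_key = {key(r): r for r in external}
    matches = []
    for r in microdata:
        k = key(r)
        # Only an unambiguous one-to-one match is treated as a possible re-identification.
        if micro_counts[k] == 1 and ext_counts.get(k, 0) == 1:
            matches.append((ext_by_key[k]["name"], r))
    return matches


# Made-up microdata: no names, but it carries a sensitive variable (income).
microdata = [
    {"age": 34, "sex": "F", "area": "A", "income": 52000},
    {"age": 34, "sex": "F", "area": "A", "income": 48000},
    {"age": 71, "sex": "M", "area": "B", "income": 19000},
    {"age": 45, "sex": "M", "area": "A", "income": 67000},
]

# Made-up external list: names plus the same key variables.
external = [
    {"name": "Person 1", "age": 34, "sex": "F", "area": "A"},
    {"name": "Person 2", "age": 71, "sex": "M", "area": "B"},
    {"name": "Person 3", "age": 29, "sex": "F", "area": "B"},
]

matches = link_unique(microdata, external)
for name, record in matches:
    print(name, "may correspond to the record with income", record["income"])
```

In this toy data only the record with an unusual combination of values is linked; the two records that share the same age, sex, and area protect each other, which is the intuition behind coarsening or suppressing rare combinations before release.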

To protect confidentiality under these circumstances, national statistical agencies invest heavily in statistical techniques, software, and policies that safeguard the confidentiality of the data they release for public use.

This paper examines the risks and effectiveness of traditional techniques for protecting data confidentiality and the privacy of individuals. It recommends that statistical agencies invest more than they already have in finding alternatives that enhance data quality and lower the risk of re-identifying individuals. Many techniques developed for federal uses are also applicable to community statistical systems, that is, data sources at the state, regional, and local levels.

Towards a National Infrastructure for Community Statistics

This publication is one in a series that investigates the design, development, and implementation of the National Infrastructure for Community Statistics (NICS), a proposed nationwide web-based utility that would give public and private decision-makers access to detailed, current community-level statistics from thousands of local, state, federal, and commercial data sources. To find out more about the NICS effort, go to