Evidence-based retirement policy: Necessity and opportunity


Retirement saving plays an important role in the U.S. economy. Americans hold more than $18 trillion in private retirement accounts like 401(k)s and IRAs, while defined benefit pensions in the private and public sectors hold trillions more. Social Security and Medicare account for nearly 40 percent of the federal budget. The government also provides tax subsidies for retirement saving and funds Medicaid, which covers long-term care for the elderly. Retirement issues will only become more important in the future as the population ages, the Baby Boom generation retires, lifespans increase, and health care costs rise.

Yet despite existing research, policymakers do not have access to robust empirical consensus when making decisions that affect the retirement security of tens of millions of families. There are many major outstanding questions:

  • How well are households preparing for retirement? Are Americans failing to accumulate enough wealth to support themselves in retirement, or are retirement saving shortfalls small and declining?
  • Why does consumption fall at retirement? Does this indicate that retirees aren’t saving enough, or does it reflect their ability to secure the same quality of life with fewer expenditures?
  • Do tax-based saving incentives raise net wealth accumulation? Do the effects vary by saver characteristics and plan design? To what extent is the substantial flow of contributions into 401(k)s and IRAs a net addition to saving, as opposed to saving that would have been done anyway, in other forms?
  • What policies boost saving? Would improving financial literacy or mandating saving be effective? Are there retirement saving programs that raise participation and increase overall wealth accumulation?
  • Why do households consistently make retirement financing decisions that do not appear to be in their own best interest? For example, why do households buy fewer annuities and reverse mortgages than expected? Why don’t more retirees boost expected lifetime benefits by waiting longer to claim Social Security? Are these choices rationally dictated by either unobserved preferences or imperfect markets? Or are they irrational responses that are better explained by a variety of decision-making biases?

Obtaining better answers to these questions and using the insights they provide to guide changes in the American retirement system could improve living standards for generations of retirees and help control the federal budget. But to obtain these answers, researchers and policymakers need better information. There are several major sources of data on U.S. households’ saving and wealth, including large-scale surveys administered by academic or public institutions. These data sets have proven useful for examining many questions but are not comprehensive or extensive enough to generate evidence that can conclusively address the major outstanding questions in retirement policy.

To generate compelling results, researchers need more than access to more comprehensive data; they must also employ better study designs. The research design most conducive to drawing causal inferences is the randomized controlled trial (RCT), in which subjects are randomly assigned to treatment and control groups. For example, one of us (Gale) has used an RCT to study the homeownership outcomes of those who used Individual Development Accounts (IDAs). The five-year analysis and the ten-year follow-up found that IDAs accelerate home buying but do not materially affect long-term rates of homeownership, a crucial assessment that helped shift advocates’ efforts to help low-income households toward other strategies.

But RCTs are not always easy to implement, and researchers often focus instead on “quasi-experimental” research designs or “natural experiments” induced by policy changes or other exogenous events. These research strategies certainly have advantages, but at times the results are hard to interpret because it is unclear what would have happened in the absence of the policy. In any case, the results of well-designed studies are harder to ignore. They can have greater impact on policymakers’ decisions because, unencumbered by contorted methodology, the findings inspire greater confidence and understanding in researchers and lawmakers alike.

In the absence of such robust studies, it is hardly surprising that almost no retirement policymaking is rooted in evidence; programs simply continue indefinitely with little or no Congressional oversight. Federal tax expenditures for retirement saving, which totaled $252 billion in 2018, have never been formally evaluated, while the Social Security Administration devotes less than 1 percent of its administrative budget to research and evaluation. Because public institutions do not formally evaluate their own programs, academic and think-tank economists provide most of the existing analyses, which are limited by the accessible data, as noted above. Using federal dollars to most efficiently improve retirement security requires building consensus around what makes effective policy, an impossible task without robust and transparent research methods, empirical replication, and the comprehensive data these processes rely upon.

Strengthening the link between expert consensus and political action could create greater demand for these invaluable data. The federal government has already taken important steps towards implementing a more evidence-based policymaking infrastructure through the Foundations for Evidence-Based Policymaking Act, passed earlier this year, which creates new government entities devoted to data sharing and evaluation, while also dramatically increasing researchers’ access to administrative data. In addition, in recent years policymakers have adopted some evidence-based policy mechanisms, like Pay for Success and tiered grantmaking, designed to funnel federal dollars to programs with demonstrated effectiveness. These initiatives present opportunities to learn about and improve existing retirement programs, but they also raise a host of new issues, such as who decides what constitutes good evidence. Despite these programs, moreover, the vast bulk of federal dollars and tax expenditures are awarded without an explicit connection to evidence.


The authors did not receive financial support from any firm or person for this article or from any firm or person with a financial or political interest in this article. None of the authors is currently an officer, director, or board member of any organization with a financial or political interest in this article.


  • Footnotes
    1. Gale (2019), Lee (2014), Poterba (2014).
    2. Biggs (2019), Munnell, Hou, and Sanzenbacher (2018), Rhee (2013), Scholz, Seshadri, and Khitatrakun (2006).
    3. Banks, Blundell, and Tanner (1998), Bernheim, Skinner, and Weinberg (1997).
    4. Chetty et al. (2014), Engen and Gale (2000), Engen, Gale, and Scholz (1994, 1996), Poterba, Venti, and Wise (1995, 1996).
    5. Chetty et al. (2014), Lusardi and Mitchell (2007).
    6. Chetty et al. (2014), Madrian and Shea (2001), Thaler and Benartzi (2004).
    7. Baily, Harris, and Wang (2019), Moulton and Haurin (2019).
    8. Brown, Kapteyn, and Mitchell (2016).
    9. The 2019 Nobel Prize in Economics was awarded to MIT professors Abhijit Banerjee and Esther Duflo and Harvard professor Michael Kremer for their work in advancing RCTs in developing countries.
    10. IDAs are specialized savings accounts designed to subsidize certain behavior, in this case purchasing a home. See Mills et al. (2009) and Grinstein-Weiss et al. (2013) for further analysis.
    11. Joint Committee on Taxation (2018).
    12. For more details on the legislation, see Results for America (2019b).
    13. Results for America (2015).