EITC Interactive: User Guide and Data Dictionary
This guide answers general questions about the operation of the interactive website and the information it provides, as well as specific questions users might ask about how to use some of these data.
For more information about the database and using the data, contact Jane R. Williams or Elizabeth Kneebone.
The latest version of the Brookings EITC Interactive website contains data for tax years 1997 through 2010. Tax return data are summarized to different geographic areas including ZIP code, city, county, metropolitan area, state, state legislative district, and congressional district. All data are derived from the Internal Revenue Service's Stakeholder Partnerships, Education, and Communication (IRS-SPEC) Return Information Databases, compiled by the IRS Wage and Investment Research Unit. This version of the site supersedes all previous versions.
What's new in this installment of the website?
The latest version of EITC Interactive adds Tax Years 2009 and 2010, the most recent years for which detailed SPEC data are available.
Significant changes have been made in the 2009 and 2010 data that can affect comparisons over time:
- Starting in Tax Year 2009, EITC Interactive data include only tax returns filed between January and June. The shift from full-year to part-year data reflects changes in SPEC reporting practices. Part-year estimates generally account for more than 90 percent of tax returns filed during the year. Use caution when making comparisons over time: compared to full-year data, part-year estimates will undercount raw numbers somewhat and may produce slightly different outcomes when used to calculate percentages.
- For Tax Years 2009 and 2010, geographic boundaries have been updated to reflect changes based on the 2010 Census. State legislative and congressional district data for 2009 and 2010 reflect recent redistricting based on the 2010 decennial census. (Earlier years of data based on previous boundaries are no longer provided through EITC Interactive, but are available upon request.) Place estimates also reflect current boundaries as of 2010. Use caution when comparing data over time for cities or towns that have changed boundaries in recent years (e.g., through annexation or consolidation).
- Two variables in the SPEC database changed in Tax Year 2010. The IRS changed the definition of "new" tax filers in the most recent data to reflect only those who have never before filed a return. It is no longer directly comparable to data in prior years, which defined "new" filers as those who did not file in the previous year but may have filed in the past. In addition, IRS removed the debt indicator in Tax Year 2010. As a result, users will note a steep drop in the number of filers requesting Refund Anticipation Loans. See the data dictionary below for more information.
Please refer to this brief for more detail and guidance on how to work with the new data.
How can I access tax return data?
The site first prompts users to choose the geography and year for which they want to view tax return information. Regardless of the type of geography chosen, the site will provide data at the ZIP code level; the site will also provide a total row at the bottom of each report that sums all ZIP codes within the selected geography.
After selecting a geography and year, the site then prompts the user to select the desired return type—either "All Tax Returns" or "EITC Returns Only"—and, in years 1999 through 2010, the variable categories. The user must then request the data in either HTML or MS Excel format.
What do the different geography types mean?
EITC Interactive allows users to request data for a variety of different communities. ZIP code-level IRS data are aggregated to create larger geographic units, including cities, counties, metro areas, states, state legislative districts, and congressional districts. Each geography type is briefly described below.
ZIP Codes are designated by the U.S. Postal Service and are the smallest areas for which the IRS reports tax return data. ZIP code boundaries (and the total number of ZIP codes) can change from year to year based on local population and business dynamics. ZIP codes provide the building blocks for all other geography types on the EITC Interactive site.
Places (Cities/Towns) include incorporated places, such as cities, towns, and villages, as well as census-designated places, which are unincorporated areas delineated by the U.S. Census Bureau for statistical purposes. See this brief for information on geographic changes between pre- and post-2009 place-level data.
Counties represent the primary legal subdivision of most states. Exceptions include Alaska and Louisiana, which are divided into boroughs and parishes, respectively. Data for all U.S. counties, boroughs, and parishes are available on the EITC Interactive site under the "county" geography type.
Metropolitan Areas are composed of counties. Metro names in this version of the site are based on the 2003 metropolitan statistical area definitions issued by the U.S. Office of Management and Budget.
State reports, in addition to state totals, provide data on all ZIP codes in the state including their associated cities, counties, and metro areas.
State Legislative Districts reflect the newly redistricted boundaries, in effect as of the 2012 elections, and are available for Tax Years 2009 and 2010 via EITC Interactive*.
EITC Interactive includes data for lower chamber (generally House) and upper chamber (generally Senate) districts in all 50 states and the District of Columbia. Note that Nebraska has a unicameral system. Data for Nebraska can be accessed through either the "lower chamber" or "upper chamber" geography type.
Congressional Districts represent the 113th United States Congress, the first Congress elected from congressional districts that were apportioned based on the 2010 census. These estimates are available for Tax Years 2009 and 2010 via EITC Interactive*.
*Users interested in pre-2009 district data based on the previous boundaries can contact Jane R. Williams for access to these files. Due to ongoing litigation, Rhode Island could not provide updated state legislative and congressional boundaries by the time of publication. Once the files are released, we will make those estimates available. See this brief for more information on the geographic changes between pre- and post-2009 district data.
What is the "return type?"
The user must select data relating either to All Tax Returns or EITC Returns Only. Data on All Tax Returns reflect the characteristics of all individual income tax returns filed in the selected geography. Data on EITC Returns Only reflect the characteristics of only those returns receiving the EITC in the selected geography. For instance, if the user wants to know how many tax filers overall received a refund, or used a paid preparer, he/she would select "All Tax Returns." If a user were instead interested in the total amount of refunds received by filers receiving the EITC, or how many EITC recipients received the Additional Child Tax Credit, he/she would select "EITC Returns Only."
What do the variable categories mean?
The website allows users to select from among several categories of variables to display for tax years 1999 through 2010 (the expanded data are not available for tax years 1997 and 1998). If the user is interested in the full complement of data in any given year, he/she can leave the boxes checked, and all available variables will be returned. If the user is interested only in select variables in any given year (e.g., only information on the number of EITC filers and EITC refunds received), he/she can de-select variable categories to exclude them from the data returned. Users can position the cursor over any of the variable category names to obtain further information about the data associated with each.
What do the column headers in the data represent?
The website returns data to the user with a series of abbreviated column header names.
Each column header contains a prefix, a root, and a suffix. The prefix refers to the return type selected:
- "t" refers to "All Tax Returns" (t=total)
- "e" refers to "EITC Returns Only" (e=EITC)
The suffix refers to the tax year of data selected:
- 97 - Tax year 1997
- 98 - Tax year 1998
- 99 - Tax year 1999
- 00 - Tax year 2000
- 01 - Tax year 2001
- 02 - Tax year 2002
- 03 - Tax year 2003
- 04 - Tax year 2004
- 05 - Tax year 2005
- 06 - Tax year 2006
- 07 - Tax year 2007
- 08 - Tax year 2008
- 09 - Tax year 2009
- 10 - Tax year 2010
The root of the variable name refers to the descriptions below.
|Variable name (root)
||Total number of returns
||Total number of returns where the taxpayer never previously filed a tax return
Note: Prior to Tax Year 2010, this variable represented the number of filers who did not file a tax return in the previous year but may have in earlier years.
||Total number of returns receiving the Earned Income Tax Credit (EITC)
||Sum of EITC received
||Total number of returns receiving the Child Tax Credit
||Sum of Child Tax Credit received
||Total number of returns receiving the refundable portion of the Child Tax Credit
||Sum of the refundable Child Tax Credit received
||Total number of returns filing Form 2441 (Child and Dependent Care Expenses)
||Total number of returns filing Form 8863 (Education Credits)
||Total number of returns receiving a deduction for payment of Student Loan Interest
Note: This is distinct from the tuition and fees deduction and any education credits.
||Total number of returns receiving a refund
||Sum of refunds received
||Total number of returns with a balance due after remittance
||Sum of balance due after remittance
||Total number of returns receiving direct deposit of refund
Note: Refund anticipation products are counted in this variable because they direct refunds to temporary bank accounts through direct deposit.
||Total number of returns requesting a Refund Anticipation Loan (RAL)
Note: Beginning in Tax Year 2010, IRS no longer provides a "debt indicator," an indication of whether the taxpayer has outstanding debt. As a result, the number of returns requesting a RAL dramatically decreased.
||Total number of returns requesting a Refund Anticipation Check (RAC)
||Total number of returns that were prepared by taxpayer
Note: This category includes filers who purchased software to prepare and file returns from home, Free File Alliance filers, and some volunteer-facilitated self-preparation.
||Total number of returns prepared by a paid preparer
||Total number of returns prepared by volunteer organizations (VITA, Military VITA and TCE)
||Total number of returns prepared by taxpayer and filed electronically through the Free File Alliance online portal
||Total number of returns filed on Form 1040
||Total number of returns filed on Form 1040A
||Total number of returns filed on Form 1040EZ
||Total number of returns filed with an Individual Taxpayer Identification Number
Note: A return is counted in this category if anyone listed on the tax form uses an ITIN. Because ITIN filers cannot claim the EITC, this variable is not available when users select EITC filers as their query universe.
||Total number of returns that filed one or more of the following schedules: Schedule C (Profit or Loss from a Business); Schedule E (Supplemental Income and Loss); Schedule F (Profit or Loss from Farming)
||Total number of returns with Adjusted Gross Income less than $5,000
||Total number of returns with Adjusted Gross Income from $5,000 to $9,999
||Total number of returns with Adjusted Gross Income from $10,000 to $14,999
||Total number of returns with Adjusted Gross Income from $15,000 to $19,999
||Total number of returns with Adjusted Gross Income from $20,000 to $24,999
||Total number of returns with Adjusted Gross Income from $25,000 to $29,999
||Total number of returns with Adjusted Gross Income from $30,000 to $34,999
||Total number of returns with Adjusted Gross Income from $35,000 to $39,999
||Total number of returns with Adjusted Gross Income from $40,000 to $49,999
||Total number of returns with Adjusted Gross Income from $50,000 to $59,999
||Total number of returns with Adjusted Gross Income from $60,000 to $74,999
||Total number of returns with Adjusted Gross Income from $75,000 to $99,999
||Total number of returns with Adjusted Gross Income greater than or equal to $100,000
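The prefix, root, and suffix convention described above can be sketched in Python. This is a minimal illustration that assumes the three parts are simply concatenated with no separator (for example `teic10`), which this guide does not state explicitly. The function names `suffix_to_tax_year` and `parse_header` are hypothetical and are not part of the website.

```python
import re

# Prefix -> return type, as listed in this guide.
RETURN_TYPES = {"t": "All Tax Returns", "e": "EITC Returns Only"}


def suffix_to_tax_year(suffix):
    """Map a two-digit suffix to a tax year (97-99 -> 1997-1999, 00-10 -> 2000-2010)."""
    value = int(suffix)
    return 1900 + value if value >= 97 else 2000 + value


def parse_header(header):
    """Split a column header into return type, root, and tax year.

    Assumption: the header is prefix + root + suffix with no separator.
    """
    # One-letter prefix, then the root, then the final two digits as the suffix.
    match = re.fullmatch(r"([te])(.+)(\d{2})", header)
    if match is None:
        raise ValueError(f"unrecognized column header: {header!r}")
    prefix, root, suffix = match.groups()
    return {
        "return_type": RETURN_TYPES[prefix],
        "root": root,
        "tax_year": suffix_to_tax_year(suffix),
    }
```

Under that assumption, `parse_header("eeicam09")` would describe the sum of EITC received on EITC returns in Tax Year 2009.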
Why are some ZIP codes shown more than once?
The ZIP code-level data that the IRS provides to Brookings contain county, state, and city identifiers. However, the IRS assigns ZIP codes to cities and towns based primarily on information from the U.S. Postal Service, which associates a ZIP code with the name of the city or town nearest to its post office location. In many instances, this does not reflect the location of the bulk of the ZIP code itself, because ZIP codes do not conform to municipal boundaries. In addition, ZIP codes do not always conform to county boundaries. In cases where ZIP codes cross multiple counties, the IRS uses information from the U.S. Postal Service to identify the primary county.
To assign ZIP codes to cities and counties, we used Geographic Information Systems (GIS) and statistical software to identify where ZIP codes were located. For ZIP codes that cross city and/or county boundaries, we used Census block-level data (the smallest units for which the Census Bureau tabulates data), along with census places and ZIP code boundaries, to calculate the proportion of the ZIP code's households that lie within each geography. We undertake the same process to assign ZIP codes and partial ZIP codes to state legislative and congressional districts.
In the data returned to the user, some ZIP codes may be displayed more than once to indicate that they reflect "split" or partial ZIP codes, assigned to more than one city and/or county (or state legislative or congressional district, depending on the dataset requested). Return and dollar amount values in these instances are estimated by allocating ZIP code totals based on the percentage of the ZIP code's households that fall within the geography's borders. Additionally, some ZIP codes (or portions thereof) are not assigned to any city or town, reflecting their location in unincorporated county territory. (While not all ZIP codes are assigned to cities or metropolitan areas, all ZIP codes are associated with counties, state legislative districts, and congressional districts.)
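The allocation of a split ZIP code's totals can be illustrated with a short Python sketch. It assumes the household shares for each geography are already known (for example from the GIS step described above). The function name, the geography labels, and the numbers are hypothetical.

```python
def allocate_zip_total(zip_total, household_shares):
    """Allocate a ZIP code total across geographies by household share.

    household_shares maps a geography name to the fraction (0 to 1) of the
    ZIP code's households that fall within that geography's borders.
    """
    for geography, share in household_shares.items():
        if not 0 <= share <= 1:
            raise ValueError(f"share for {geography!r} must be between 0 and 1")
    return {
        geography: zip_total * share
        for geography, share in household_shares.items()
    }


# Hypothetical ZIP code with 1,000 EITC returns: 70 percent of its households
# lie in City A and 30 percent in unincorporated county territory.
estimates = allocate_zip_total(1000, {"City A": 0.7, "Unincorporated": 0.3})
```

Each resulting row is an estimate, which is why a split ZIP code can appear more than once in a report with different values.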
For the sake of confidentiality, the IRS suppresses return counts of less than 10. We are able to impute suppressed totals at the ZIP code level for the following variables: eic, ctc, actc, ref, and bal; however, all other variables may be subject to data suppression.
Important Note: Because of the estimation techniques employed in assigning ZIP codes to cities, counties, state legislative districts, and congressional districts, the data displayed here will differ from geography totals obtained directly from the IRS files. In addition, some data displayed will include return counts of fewer than 10. These represent estimates only. Small ZIP code-level counts should be interpreted only in conjunction with data from other ZIP codes (e.g., at the place, county, or district levels) and not as stand-alone entities.
Why do certain ZIP codes appear in some years and not others?
ZIP codes change over time. Boundaries may shift from year to year, new ZIP codes are created, and at times old ZIP codes are phased out of use. Thus not all ZIP codes will appear in each tax year's dataset. However, using GIS to assign ZIP codes to cities, counties, and districts (as described above) recognizes ZIP code changes over time so that generally the city-, county-, metropolitan area-, state-, and district-level estimates from this site are comparable from year to year. See this brief for changes that may affect comparisons over time. Note that because the ZIP code assignment methods used in this version of the website have been enhanced in each tax year to account for ZIP code changes over time, the estimates in this year's database may differ somewhat from past versions of the site.
How can I view EITC Interactive data on a map?
EITC Interactive data can be mapped for free using PolicyMap.com.* Users will find the data in the "Money and Income" tab under "Federal Tax Returns." Users can select whether to map data for all tax returns or for EITC returns only and which variables and years to view. Users can also zoom to particular locations and select the geography (county, congressional district, place, etc.) by which to shade the map. You do not need to create an account or purchase a subscription to PolicyMap.com in order to map EITC Interactive data.
* Tax Years 2009 and 2010 data for places, congressional districts, and legislative districts cannot currently be mapped on PolicyMap.com. These geographies will be available for mapping in early 2013.
Are additional data available?
The data available on this site reflect some, but not all, of the data published by IRS-SPEC in its Tax Return Information databases. Users interested in viewing the full complement of data available should contact their local SPEC Territory Managers to obtain a copy of the database.
In future years, the IRS will make additional data available to Brookings and interested users. As it does, we will update this website accordingly.
How many people in my community benefit from the EITC?
The total number of tax filers claiming the EITC is given by the eic field for the tax year you choose. In general, filers receive this and other tax credits in the year following the tax year for which they file. For instance, most filers who claimed the EITC for Tax Year 2010 received the credit upon filing returns in early 2011.
An important related measure is the proportion of tax filers in your community who receive the EITC. This indicates the relative importance of the EITC for workers and families in your area, and the degree to which people earn low wages. You can calculate this proportion by selecting All Tax Returns as the return type, and dividing the eic field by the return field for the tax year you choose. As a benchmark, in recent years roughly 19 to 20 percent of all tax filers nationally have claimed the EITC.*
*These figures are calculated from EITC Interactive part-year data from Tax Years 2009 and 2010. Different data sources may provide slightly different figures. See this brief for more information about comparing tax return data from different sources.
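The share of filers claiming the EITC, as described above, is a single division. The sketch below uses hypothetical values in place of the eic and return fields from an All Tax Returns report. The Python variable is named `returns` because `return` is a reserved word in Python.

```python
def eitc_share(eic, returns):
    """Share of all tax filers claiming the EITC: the eic field divided by the return field."""
    if returns <= 0:
        raise ValueError("the return count must be positive")
    return eic / returns


# Hypothetical community: 2,400 EITC returns out of 12,000 total returns.
share = eitc_share(eic=2400, returns=12000)
print(f"{share:.1%} of filers claimed the EITC")
```

The result can then be compared with the national benchmark of roughly 19 to 20 percent cited above.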
What's the typical value of the EITC that people in my community receive?
Depending on their income and family structure, tax filers may claim an EITC worth anywhere from $1 to over $5,000. For your community, in any given year you can find the average credit that EITC filers received by dividing the eicam field by the eic field. In Tax Year 2010, the average credit received among EITC filers nationwide was $2,247.
Two primary factors influence the average EITC claim in your community:
- Childless workers with very low incomes (under $13,460 for single filers in Tax Year 2010) are eligible for a much smaller credit (up to $457 in 2010) than workers with children, whose maximum credits in 2010 were $3,050 (for families with one qualifying child), $5,036 (for families with two qualifying children), and $5,666 (for families with three or more qualifying children). If a higher-than-average share of EITC recipients in your community are childless workers, the average credit amount will likely be smaller. Nationally, about 80 percent of EITC recipients claim the credit for families with qualifying children.
- The wages earned by low-income families influence the credit amount for which they are eligible. Nationally, almost half of filers who received the EITC in Tax Year 2010 had adjusted gross incomes above $15,000. For every additional dollar families with children earned above $16,450 ($21,460 for married families), the amount of credit they received decreased. Thus, EITC filers in higher-wage, higher cost-of-living areas tend to receive smaller credits; conversely, average credits in lower-wage areas tend to be larger.
Note that EITC dollars claimed and tax refunds received are not equivalent. Some EITC dollars (12 percent nationwide) offset income taxes that families owe, and thus do not translate directly into refund dollars. Additionally, families may claim other credits (like the Child Tax Credit, the Child and Dependent Care Tax Credit, and Education Credits) that add to their refunds, and some refund dollars represent taxes that were over-withheld over the course of the year. For most low-income families who receive tax refunds, however, the EITC makes up the largest part of those refunds, and is thus the most important part of the federal tax code for many low-income communities. Users can calculate the relative contribution of the EITC to low-income taxpayers' refunds by comparing the eicam and refam variables for EITC Returns Only in the selected geography and tax year.
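The two calculations in this section, the average credit (eicam divided by eic) and the comparison of eicam with refam, can be sketched as follows. The values are hypothetical stand-ins for an EITC Returns Only report. Because some EITC dollars offset taxes owed rather than adding to refunds, the second ratio is only a rough indicator.

```python
def average_credit(eicam, eic):
    """Average EITC per EITC return: the eicam field divided by the eic field."""
    if eic <= 0:
        raise ValueError("the eic count must be positive")
    return eicam / eic


def eitc_relative_to_refunds(eicam, refam):
    """EITC dollars claimed relative to refund dollars received on EITC returns."""
    if refam <= 0:
        raise ValueError("the refam amount must be positive")
    return eicam / refam


# Hypothetical EITC Returns Only values for one community and tax year.
community = {"eic": 2400, "eicam": 5_280_000, "refam": 8_800_000}

avg = average_credit(community["eicam"], community["eic"])
ratio = eitc_relative_to_refunds(community["eicam"], community["refam"])
print(f"Average credit: ${avg:,.0f}; EITC relative to refunds: {ratio:.0%}")
```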
What do the RAL and RAC variables mean? How can I use them?
RAL is an abbreviation for refund anticipation loan while RAC stands for refund anticipation check, products sold by most commercial tax preparers. Low-income taxpayers who claim the EITC represent the majority of the marketplace for both products.
By purchasing a RAL, the tax filer assigns the proceeds of his/her tax refund to the preparer's bank partner, and the preparer arranges a loan for the taxpayer in the amount of his/her refund, net of fees for tax preparation and the loan itself. The bank makes the loan available to the taxpayer within 1-2 days, and the IRS typically delivers the taxpayer's refund to the bank within about 10 days. For this short-term loan, the taxpayer often pays fees in excess of $100 (in addition to the fees they pay to have their taxes prepared and filed), and incurs an implicit annual interest rate on the loan of 250 percent or higher. Starting in Tax Year 2010, the IRS no longer provides commercial tax preparers with a "debt indicator," an indication of whether the taxpayer has outstanding debt. As a result, research has shown the number of returns requesting a RAL has dramatically decreased. In Tax Year 2007, 38 percent of EITC recipients using paid preparers requested RALs. By 2010, this percentage dropped to just over 5 percent.
As the use of RALs has declined, an increasing number of tax filers have requested RACs. With a RAC, instead of issuing a loan within 1-2 days, the bank opens a temporary bank account into which the IRS direct deposits the refund check. After the refund is deposited, the bank issues the consumer a paper check or prepaid debit card with the RAC proceeds and closes the temporary account. This process usually takes 7-15 days. Between Tax Year 2007 and 2010, the percentage of EITC recipients using paid preparers who requested RACs increased from 26 percent to 56 percent.
The ral variable in the database represents the number of returns for which the taxpayer requested a RAL. To determine the proportion of filers in your community who requested RALs, divide the ral field by the ref field for the tax year selected. You can calculate this proportion either for All Tax Returns or EITC Returns Only. To determine the proportion of filers requesting RACs in a given year, divide the rac field by the ref field.
To estimate how much money EITC filers in your community are spending on RALs and RACs, investigate what local firms charge for the service. Ask local taxpayers who have received the EITC and purchased a refund product in the past if they'd be willing to share their documentation with you. Often these products are referred to familiarly as "rapid refund" or "fast cash" products. Typically, tax preparers and their bank partners charge a fee for the RAL that is based on the size of the anticipated refund, plus additional flat "documentation" or "loan preparation" fees. With information on the average price for these products in your community, and the number of RALs and RACs requested by EITC filers in years past, you can estimate the amount that low-income filers in your community spend on these high-cost refund products.
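The RAL and RAC proportions and the spending estimate described above can be sketched in Python. The counts and the average fees below are hypothetical. In practice the fees would come from the local research suggested above, since this database does not contain price information.

```python
def refund_product_rates(ral, rac, ref):
    """Proportion of refund recipients requesting a RAL or a RAC (ral / ref and rac / ref)."""
    if ref <= 0:
        raise ValueError("the ref count must be positive")
    return {"ral_rate": ral / ref, "rac_rate": rac / ref}


def estimated_spending(ral, rac, avg_ral_fee, avg_rac_fee):
    """Rough spending estimate: number of products requested times the average local fee."""
    return ral * avg_ral_fee + rac * avg_rac_fee


# Hypothetical EITC Returns Only counts for one community and tax year.
rates = refund_product_rates(ral=50, rac=500, ref=1000)

# Hypothetical average fees gathered from local taxpayers' documentation.
spending = estimated_spending(ral=50, rac=500, avg_ral_fee=60.0, avg_rac_fee=30.0)
print(rates, f"estimated spending: ${spending:,.0f}")
```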
How many people in my community are eligible for the EITC, but don't receive it? What's the amount that is "left on the table" as a result?
The best available research by the IRS and other scholars suggests that between 80 and 85 percent of tax filers who are eligible for the EITC claim the credit. That participation rate exceeds rates for other well-known income support programs like Food Stamps and TANF cash assistance. However, because the EITC can provide a family with such a significant cash infusion, and because a broader range of working families are eligible for the credit than for other means-tested programs, local organizations are devoting significant effort to alerting potentially eligible families about how to claim the EITC.
The IRS continues to work on research methods that will provide better estimates on participation rates in the EITC at a small-area level. (For a detailed explanation of the issues involved in arriving at such estimates, see: Earned Income Credit Participation—What We (Don't) Know) In the interim, because the actual population that claims the EITC changes so much from year to year (one-third of filers who claimed the credit in any given year did not claim it the prior year), and because some families will inevitably miss out on the credit despite the best outreach efforts, the total number of eligible families not claiming the credit in your community may not provide the most useful benchmark for judging outreach efforts.
A more useful approach might be to ask, if you were able to increase the number of eligible filers in your area who claim the credit by 5 percent, how many additional workers and families would benefit, and how many additional EITC dollars would flow into your community? The participation gap is likely to be larger in communities that have more: very low-income working families (incomes under $10,000); low-income Hispanic families and families whose first language is not English; and families with more than two children. Eligible members of these groups have been found to claim the credit at lower rates than the national average. If your community has large numbers of these types of families, you may have the opportunity to raise the number of families claiming the credit by perhaps as much as 10 percent. To calculate the potential additional number of eligible filers, simply multiply the eic value for the most recent year by 5 percent to 10 percent.
Number of EITC-eligible non-filers that an effective outreach campaign could encourage to file = eic x 5% to 10%
Research also suggests that eligible filers who fail to claim the credit are typically eligible for somewhat smaller credits on average than those filers who do claim the credit. To calculate the potential economic benefit for these families and your community, multiply the average EITC in your community (eicam divided by eic) by 50 percent, and multiply the result by the estimated number of additional eligible filers you hope to reach:
EITC dollars that an outreach campaign could add to families and the community = eic x 5% to 10% x (eicam / eic) x 50% = eicam x 2.5% to 5%
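The two outreach formulas above can be worked through in a few lines of Python. The input values are hypothetical. The 5 to 10 percent uptake range and the 50 percent adjustment come from the text, and the function name is an illustration only.

```python
def outreach_estimate(eic, eicam, uptake_rate, credit_adjustment=0.5):
    """Estimate additional EITC filers and dollars from an outreach campaign.

    uptake_rate: assumed increase in claimants (0.05 to 0.10 in this guide).
    credit_adjustment: non-claimants are assumed eligible for about half the
    average credit, per the 50 percent factor used in this guide.
    """
    additional_filers = eic * uptake_rate
    average_credit = eicam / eic
    additional_dollars = additional_filers * average_credit * credit_adjustment
    return additional_filers, additional_dollars


# Hypothetical community: 10,000 EITC returns claiming $20 million in total.
low_filers, low_dollars = outreach_estimate(10_000, 20_000_000, uptake_rate=0.05)
high_filers, high_dollars = outreach_estimate(10_000, 20_000_000, uptake_rate=0.10)
print(f"{low_filers:,.0f} to {high_filers:,.0f} additional filers")
print(f"${low_dollars:,.0f} to ${high_dollars:,.0f} in additional EITC dollars")
```

The dollar results equal eicam multiplied by 2.5 percent and 5 percent, which matches the simplified form of the formula.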
Finally, the introduction of the Additional Child Tax Credit (ACTC), the refundable version of the Child Tax Credit (CTC), increases the refund amounts available to many EITC-eligible filers. National figures show that taxpayers claimed over $25 billion in ACTC in Tax Year 2010, with over 75 percent of those dollars ($19 billion) going to EITC filers. For Tax Years 2004 through 2010, users can calculate the relative contribution of the ACTC to taxpayers' refunds by comparing the actcam and refam variables for either All Tax Returns or EITC Returns Only in the selected geography and tax year.
Note that these estimates would provide you with the potential impact on a one-year basis, and would not take into account broader factors that also influence the number of filers in a particular community who receive the credit, including: population growth/decline; changing employment and wage levels; and increases and decreases in the number of children living at home. At the same time, because they represent only one-year estimates, longer-range plans should take into account that these potential economic benefits would recur on an annual basis.
Can I use these data to assess the impact that my campaign has had on participation in the EITC?
As noted above, a number of factors influence the number of filers, and proportion of total filers, who claim the EITC in a given community. Knowledge of the credit is just one of these factors. Economic and demographic changes arguably exert even greater influence on EITC usage.
Recognizing that, the data provided here may afford outreach coordinators the opportunity to track the number of EITC claimants, and the proportion of total filers they represent, over time and to compare results to those from similar communities that are not targets for outreach. Economic forces typically operate at the regional level, so for purposes of comparison you might look for other communities of similar size and population makeup within your region. These comparisons may work best for larger units of geography—cities and counties—rather than small units like ZIP codes where population and employment changes may play a larger role.
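One way to structure the comparison described above is to measure the change in the EITC share of filers in the outreach community and subtract the change in a similar comparison community. This is a simple difference-of-changes sketch that this guide does not prescribe. All values are hypothetical, and both years should use data on the same basis (for example, both part-year).

```python
def eitc_share(eic, returns):
    """EITC returns as a share of all returns (eic divided by return)."""
    return eic / returns


def change_in_share(before, after):
    """Percentage-point change in the EITC share between two tax years."""
    return (eitc_share(**after) - eitc_share(**before)) * 100


def relative_change(target_before, target_after, comparison_before, comparison_after):
    """Change in the target community minus change in the comparison community."""
    return (change_in_share(target_before, target_after)
            - change_in_share(comparison_before, comparison_after))


# Hypothetical All Tax Returns values for two communities in two tax years.
target_before = {"eic": 1800, "returns": 10_000}
target_after = {"eic": 2100, "returns": 10_000}
comparison_before = {"eic": 1800, "returns": 10_000}
comparison_after = {"eic": 1900, "returns": 10_000}

difference = relative_change(target_before, target_after,
                             comparison_before, comparison_after)
print(f"Target community gained {difference:.1f} points more than the comparison")
```

A positive difference is only suggestive. As noted above, economic and demographic changes can move these figures independently of any outreach.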
One area in which you could more easily track the impact of community outreach efforts is the usage of RALs and RACs. This measure may be more sensitive to outreach and public awareness campaigns, and less sensitive to broader changes in the local population and economy. Further reducing the proportion of taxpayers who use RALs and RACs over time can help keep more money in working families' pockets. Other areas to track include the use of paid preparers as well as returns filed through volunteer tax preparers.
Can I use these data to characterize the total economic impact that EITC dollars have in my local economy?
Economists often describe the total economic impact of a fiscal injection such as the EITC into a local economy through the use of an "economic multiplier." The multiplier represents the factor by which total economic output resulting from the initial investment exceeds that investment, due to the additional economic activity it spurs. The multiplier in any given local economy depends on the interdependence of its different sectors, so it may vary widely from community to community. One example of research on the total economic impact of EITC dollars, for the city of San Antonio, found that refunded EITC dollars spent in the local economy would generate a total economic impact 58 percent larger than those initial expenditures.
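The multiplier arithmetic can be illustrated with the San Antonio figure cited above, where total impact was 58 percent larger than the initial expenditures (a multiplier of 1.58). The dollar amount below is hypothetical, and the multiplier for any other community is an assumption that would need local economic analysis.

```python
# From the San Antonio example: total impact 58 percent larger than initial spending.
SAN_ANTONIO_MULTIPLIER = 1.58


def total_economic_impact(dollars_spent_locally, multiplier):
    """Total economic impact: local spending times the economic multiplier."""
    if multiplier < 1:
        raise ValueError("a multiplier below 1 would mean impact smaller than spending")
    return dollars_spent_locally * multiplier


# Hypothetical: $1 million of refunded EITC dollars spent in the local economy.
impact = total_economic_impact(1_000_000, SAN_ANTONIO_MULTIPLIER)
print(f"Estimated total impact: ${impact:,.0f}")
```

Note that the input is the portion of refunded EITC dollars actually spent locally, which is generally smaller than the eicam total for the community.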