EITC Interactive: User Guide and Data Dictionary



INTERACTIVE DATA

Display and download ZIP code-level tax return information for states, metro areas, counties, cities, and state legislative and congressional districts for tax years 1997 through 2006.


    This guide answers general questions about the operation of the interactive website and the information it provides, as well as specific questions users might ask about how to use some of these data.

    For more information about the database and using the data, contact Elizabeth Kneebone.

    New Feature

    General Questions

    EITC-Specific Questions

    New Feature

    How can I view EITC Interactive data on a map?
    Users can now map EITC Interactive data for tax years 2000 to 2006. After requesting data through the EITC Interactive application (by following the steps described below), users can click on any cell in the HTML report to view that indicator on PolicyMap.com.

    Example: a user requests tax year 2006 data on the use of refund anticipation products for all tax filers in the cities of Chicago and Springfield, Illinois. EITC Interactive will return a data report with 74 records, or one record for each ZIP code (or portion of a ZIP code) located in Chicago and Springfield. Each record will have three data columns—the first reports total tax filers, the second lists all Refund Anticipation Loans (RALs) requested, and the third shows all Refund Anticipation Checks (RACs) requested.

    To view a map: If the user wants to map the number of RALs in Chicago, clicking on the RAL cell for any ZIP code associated with Chicago will take them to a city-level map highlighting Chicago’s data.

    Alternatively, a user could click on the RAC cell in any Springfield ZIP code to be taken directly to Springfield’s map for that indicator.

    Using the PolicyMap legend: Once on PolicyMap, users can make changes to their map through the legend located on the left side of the page. For instance, in the Chicago RAL example, users can:

    • Click on the % sign under the “Change Variable” heading to show the share of all filers in Chicago requesting a RAL. (To get back to the absolute number, click on the # symbol under the same heading.)
    • Click on any highlighted year under the “Change Year” heading to see RAL data for that year
    • Choose to view RAL data for another community type by clicking on the “Shade by” dropdown menu and selecting a different level of geography
    Note: EITC Interactive data are available for free on PolicyMap.com. Users do not need to create an account or purchase a subscription to PolicyMap.com in order to map EITC Interactive data. Creating a PolicyMap account or purchasing a subscription gives users more options for viewing and manipulating data on PolicyMap, but is not a requirement.


    General Questions
    The latest version of the Brookings EITC Interactive website contains data for tax years 1997 through 2006. Tax return data are summarized to different geographic areas including ZIP code, city, county, metropolitan area, state, state legislative district, and congressional district. All data are derived from the Internal Revenue Service's Stakeholder Partnerships, Education, and Communication (IRS-SPEC) Return Information Databases, compiled by the IRS Wage and Investment Research Unit. This version of the site supersedes all previous versions.

    What's new in this installment of the website?
    In addition to the range of information available in past years, the site now offers users data through tax year 2006, the most recent year for which complete IRS-SPEC data are available.

    What do the different geography types mean?
    EITC Interactive allows users to request data for a variety of different communities. ZIP code-level IRS data are aggregated to create larger geographic units, including cities, counties, metro areas, states, state legislative districts, and congressional districts. Each geography type is briefly described below.

    ZIP Codes are designated by the U.S. Postal Service and are the smallest areas for which the IRS reports tax return data. ZIP code boundaries (and the total number of ZIP codes) can change from year to year based on local population and business dynamics. ZIP codes provide the building blocks for all other geography types on the EITC Interactive site.

    Places (Cities/Towns) include incorporated places, such as cities, towns, and villages, as well as census-designated places, which are unincorporated areas delineated by the U.S. Census Bureau for statistical purposes.

    Counties represent the primary legal subdivision of most states. Exceptions include Alaska and Louisiana, which are divided into boroughs and parishes, respectively. Data for all U.S. counties, boroughs, and parishes are available on the EITC Interactive site under the "county" geography type.

    Metropolitan Areas are composed of counties. Metro names in this version of the site are based on the 2003 metropolitan statistical area definitions issued by the U.S. Office of Management and Budget.

    State reports, in addition to state totals, provide data on all ZIP codes in the state including their associated cities, counties, and metro areas.

    State Legislative Districts reflect boundaries in effect as of 2006. EITC Interactive includes data for lower chamber (generally House) and upper chamber (generally Senate) districts in all 50 states and the District of Columbia. Note that Nebraska has a unicameral system. Data for Nebraska can be accessed through either the "lower chamber" or "upper chamber" geography type.

    Congressional Districts represent the 110th United States Congress, which is in effect from January 3, 2007 to January 3, 2009. The apportionment of seats in the House of Representatives is based on population as of Census 2000.

    How can I access tax return data?

    The site first prompts users to choose the geography and year for which they want to view tax return information. Regardless of the type of geography chosen, the site will provide data at the ZIP code level; the site will also provide a total row at the bottom of each report that sums all ZIP codes within the selected geography.

    After selecting a geography and year, the site then prompts the user to select the desired return type—either "All Tax Returns" or "EITC Returns Only"—and, in years 1999 through 2006, the variable categories. The user must then request the data in either HTML or MS Excel format.

    What is the "return type"?
    The user must select data relating either to All Tax Returns or EITC Returns Only. Data on All Tax Returns reflect the characteristics of all individual income tax returns filed in the selected geography. Data on EITC Returns reflect the characteristics of only those returns receiving the EITC in the selected geography. For instance, if the user wants to know how many tax filers overall received a refund, or used a paid preparer, he/she would select "All Tax Returns." If a user were instead interested in the total amount of refunds received by filers receiving the EITC, or how many EITC recipients received a Refund Anticipation Loan, he/she would select "EITC Returns Only."

    What do the variable categories mean?
    The website allows users to select from among several categories of variables to display for tax years 1999 through 2006 (the expanded data are not available for tax years 1997 and 1998). If the user is interested in the full complement of data in any given year, he/she can leave the boxes checked, and all available variables will be returned. If the user is interested only in select variables in any given year (e.g., only information on the number of EITC filers and EITC refunds received), he/she can de-select variable categories to exclude them from the data returned. Users can position the cursor over any of the variable category names to obtain further information about the data associated with each.

    What do the column headers in the data represent?
    The website returns data to the user with a series of abbreviated column header names.

    Each column header contains a prefix and a suffix. The prefix refers to the return type selected:

    • "t" refers to "All Tax Returns" (t=total)
    • "e" refers to "EITC Returns Only" (e=EITC)
    The suffix refers to the tax year of data selected:

    • 97 - Tax year 1997
    • 98 - Tax year 1998
    • 99 - Tax year 1999
    • 00 - Tax year 2000
    • 01 - Tax year 2001
    • 02 - Tax year 2002
    • 03 - Tax year 2003
    • 04 - Tax year 2004
    • 05 - Tax year 2005
    • 06 - Tax year 2006
    The root of the variable name refers to the descriptions below.

    Variable name (root)  Variable description
    return                Total number of returns
    new                   Total number of returns where the taxpayer did not file a return in the previous tax year
    eic                   Total number of returns receiving the Earned Income Tax Credit (EITC)
    eicam                 Sum of EITC received
    ctc                   Total number of returns receiving the Child Tax Credit
    ctcam                 Sum of Child Tax Credit received
    actc                  Total number of returns receiving the refundable portion of the Child Tax Credit
    actcam                Sum of the refundable Child Tax Credit received
    cdctc                 Total number of returns filing Form 2441 (Child and Dependent Care Expenses)
    edcr                  Total number of returns filing Form 8863 (Education Credits)
    sld                   Total number of returns receiving a deduction for Student Loan Interest
    ref                   Total number of returns receiving a refund
    refam                 Sum of refunds received
    bal                   Total number of returns with a balance due after remittance
    balam                 Sum of balance due after remittance
    dirdp                 Total number of returns receiving direct deposit of refund
    ral                   Total number of returns requesting a Refund Anticipation Loan (RAL)
    rac                   Total number of returns requesting a Refund Anticipation Check (RAC)
    self                  Total number of returns that were prepared by taxpayer
    paid                  Total number of returns prepared by a paid preparer
    vol                   Total number of returns prepared by volunteer organizations (VITA, Military VITA and TCE)
    freef                 Total number of returns prepared by taxpayer and filed electronically through the Free File Alliance online portal
    1040_                 Total number of returns filed on Form 1040
    1040a                 Total number of returns filed on Form 1040A
    1040z                 Total number of returns filed on Form 1040EZ
    itin                  Total number of returns filed with an Individual Taxpayer Identification Number
    cef                   Total number of returns that filed one or more of the following schedules: Schedule C (Profit or Loss from a Business); Schedule E (Supplemental Income and Loss); Schedule F (Profit or Loss from Farming)
    agi0_                 Total number of returns with Adjusted Gross Income less than $5,000
    agi5_                 Total number of returns with Adjusted Gross Income from $5,000 to $9,999
    agi10_                Total number of returns with Adjusted Gross Income from $10,000 to $14,999
    agi15_                Total number of returns with Adjusted Gross Income from $15,000 to $19,999
    agi20_                Total number of returns with Adjusted Gross Income from $20,000 to $24,999
    agi25_                Total number of returns with Adjusted Gross Income from $25,000 to $29,999
    agi30_                Total number of returns with Adjusted Gross Income from $30,000 to $34,999
    agi35_                Total number of returns with Adjusted Gross Income from $35,000 to $39,999
    agi40_                Total number of returns with Adjusted Gross Income from $40,000 to $49,999
    agi50_                Total number of returns with Adjusted Gross Income from $50,000 to $59,999
    agi60_                Total number of returns with Adjusted Gross Income from $60,000 to $74,999
    agi75_                Total number of returns with Adjusted Gross Income from $75,000 to $99,999
    agi1k_                Total number of returns with Adjusted Gross Income greater than or equal to $100,000
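
    The prefix, root, and suffix scheme described above can be sketched in a few lines of Python. This is only an illustration: it assumes the three parts are joined directly with no separator (the guide does not show a complete header, so check the result against a downloaded report), and the names PREFIXES and build_column_name are hypothetical.

```python
# Hypothetical helper: compose an EITC Interactive column header from its parts.
# Assumption: prefix + root + two-digit year suffix are joined with no separator.

PREFIXES = {"all": "t", "eitc": "e"}  # "t" = All Tax Returns, "e" = EITC Returns Only


def build_column_name(return_type: str, root: str, tax_year: int) -> str:
    """Return the assumed column header for a return type, variable root, and tax year."""
    if return_type not in PREFIXES:
        raise ValueError("return_type must be 'all' or 'eitc'")
    if not 1997 <= tax_year <= 2006:
        raise ValueError("data cover tax years 1997 through 2006")
    suffix = f"{tax_year % 100:02d}"  # 1997 -> "97", 2006 -> "06"
    return f"{PREFIXES[return_type]}{root}{suffix}"


print(build_column_name("all", "eic", 2006))     # teic06
print(build_column_name("eitc", "refam", 1999))  # erefam99
```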

    Why are some ZIP codes shown more than once?
    The ZIP code-level data that the IRS provides to Brookings contain county, state, and city identifiers. However, the IRS assigns ZIP codes to cities and towns based primarily on information from the U.S. Postal Service, which associates a ZIP code with the name of the city or town nearest to its post office location. In many instances, this does not reflect the location of the bulk of the ZIP code itself, because ZIP codes do not conform to municipal boundaries. In addition, ZIP codes do not always conform to county boundaries. In cases where ZIP codes cross multiple counties, the IRS uses information from the U.S. Postal Service to identify the primary county.

    To assign ZIP codes to cities and counties, we used Geographic Information Systems (GIS) and statistical software to identify where ZIP codes were located. For ZIP codes that cross city and/or county boundaries, we used Census 2000 block-level data (the smallest units for which the Census Bureau tabulates data), along with census places and ZIP code boundaries, to calculate the proportion of the ZIP code's households that lie within each geography. We undertake the same process to assign ZIP codes and partial ZIP codes to state legislative and congressional districts.

    In the data returned to the user, some ZIP codes may be displayed more than once to indicate that they reflect "split" or partial ZIP codes, assigned to more than one city and/or county (or state legislative or congressional district, depending on the dataset requested). Return and dollar amount values in these instances are estimated by allocating ZIP code totals based on the percentage of the ZIP code's households that fall within the geography's borders. Additionally, some ZIP codes (or portions thereof) are not assigned to any city or town, reflecting their location in unincorporated county territory. (While not all ZIP codes are assigned to cities or metropolitan areas, all ZIP codes are associated with counties, state legislative districts, and congressional districts.)
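
    The household-share allocation described above can be illustrated with a short Python sketch. The ZIP code figures, geography names, and the function allocate_zip_total are hypothetical, and the sketch does not reproduce the GIS step that produces the household shares or any rounding the site may apply.

```python
# Sketch: allocate a ZIP code's total to the geographies it overlaps,
# in proportion to the share of the ZIP code's households in each one.


def allocate_zip_total(zip_total: float, household_shares: dict[str, float]) -> dict[str, float]:
    """Split zip_total across geographies using household shares that sum to 1."""
    if abs(sum(household_shares.values()) - 1.0) > 1e-9:
        raise ValueError("household shares must sum to 1")
    return {geo: zip_total * share for geo, share in household_shares.items()}


# Hypothetical split ZIP code: 1,000 EITC returns, 75% of households inside City A.
shares = {"City A": 0.75, "Unincorporated": 0.25}
print(allocate_zip_total(1000, shares))  # {'City A': 750.0, 'Unincorporated': 250.0}
```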

    For the sake of confidentiality, the IRS suppresses return counts of less than 10. We are able to impute suppressed totals at the ZIP code level for the following variables: eic, ctc, actc, ref, and bal; however, all other variables may be subject to data suppression.

    Important Note: Because of the estimation techniques employed in assigning ZIP codes to cities, counties, state legislative districts, and congressional districts, the data displayed here will differ from geography totals obtained directly from the IRS files. In addition, some data displayed will include return counts of less than 10. These represent estimates only. Small ZIP code-level counts should be interpreted only in conjunction with data from other ZIP codes (e.g., at the place, county, or district levels) and not as stand-alone entities.

    Why do certain ZIP codes appear in some years and not others?
    ZIP codes change over time. Boundaries may shift from year to year, new ZIP codes are created, and at times old ZIP codes are phased out of use. Thus not all ZIP codes will appear in each tax year's dataset. However, using GIS to assign ZIP codes to cities, counties, and districts (as described above) recognizes ZIP code changes over time so that the city-, county-, metropolitan area-, state-, and district-level estimates from this site are comparable from year to year. Note that because the ZIP code assignment methods used in this version of the website have been enhanced in each tax year to account for ZIP code changes over time, the estimates in this year's database may differ somewhat from past versions of the site.

    Are additional data available?
    The data available on this site reflect some, but not all, of the data published by IRS-SPEC in its Tax Return Information databases. Users interested in viewing the full complement of data available should contact their local SPEC Territory Managers to obtain a copy of the database.

    In future years, the IRS will make additional data available to Brookings and interested users. As it does, we will update this website accordingly.

    EITC-Specific Questions

    How many people in my community benefit from the EITC?
    The total number of tax filers claiming the EITC is given by the eic field for the tax year you choose. In general, filers receive this and other tax credits in the year following the tax year for which they file. For instance, most filers who claimed the EITC for tax year 2006 received the credit upon filing returns in early 2007.

    An important related measure is the proportion of tax filers in your community who receive the EITC. This indicates the relative importance of the EITC for workers and families in your area, and the degree to which people earn low wages. You can calculate this proportion by selecting All Tax Returns as the return type, and dividing the eic field by the return field for the tax year you choose. As a benchmark, in recent years roughly 16 to 17 percent of all tax filers nationally have claimed the EITC.
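
    As a minimal sketch of the calculation above, the following Python snippet divides the eic field by the return field. The community figures are hypothetical, and eitc_share is an illustrative name rather than part of the site.

```python
def eitc_share(eic: float, returns: float) -> float:
    """Share of all tax filers claiming the EITC (eic / return, All Tax Returns)."""
    if returns <= 0:
        raise ValueError("return count must be positive")
    return eic / returns


# Hypothetical community: 2,000 EITC returns out of 10,000 total returns.
share = eitc_share(2000, 10000)
print(f"{share:.1%}")  # 20.0%
```

    A result like this one would sit above the national benchmark of roughly 16 to 17 percent.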

    What's the typical value of the EITC that people in my community receive?
    Depending on their income and family structure, tax filers may claim an EITC the value of which could range from $1 to over $4,000. For your community, in any given year you can find the average credit that EITC filers received by dividing the eicam field by the eic field. In tax year 2006, the average credit received among EITC filers nationwide was $1,951.

    Two primary factors influence the average EITC claim in your community:

    • Childless workers with very low incomes (under $12,120 for single filers) are eligible for a much smaller credit (up to $412 in tax year 2006) than workers with children, whose maximum credits are $2,747 (for families with one qualifying child) or $4,536 (for families with two or more qualifying children) in tax year 2006. If a higher-than-average share of EITC recipients in your community are childless workers, the average credit amount will likely be smaller. Nationally, about 80 percent of EITC recipients claim the credit for families with qualifying children.
    • The wages earned by low-income families influence the credit amount for which they are eligible. Many families with children who received the EITC in tax year 2006 had adjusted gross incomes above $15,000, which means that for every additional dollar they earn, the amount of credit they receive decreases. Thus, EITC filers in higher-wage, higher cost-of-living areas tend to receive smaller credits; conversely, average credits in lower-wage areas tend to be larger.
    Note that EITC dollars claimed and tax refunds received are not equivalent. Some EITC dollars (12 percent nationwide) offset income taxes that families owe, and thus do not translate directly into refund dollars. Additionally, families may claim other credits (like the Child Tax Credit, the Child and Dependent Care Tax Credit, and Education Credits) that add to their refunds, and some refund dollars represent taxes that were over-withheld over the course of the year. For most low-income families who receive tax refunds, however, the EITC makes up the largest part of those refunds— and is thus the most important part of the federal tax code for many low-income communities. Users can calculate the relative contribution of the EITC to low-income taxpayers' refunds by comparing the eicam and refam variables for EITC Returns in the selected geography and tax year.
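
    The two ratios discussed in this section, the average credit (eicam divided by eic) and the EITC's contribution to refunds (eicam compared with refam on EITC Returns), can be sketched as follows. The input figures and function names are hypothetical. Because some EITC dollars offset taxes owed rather than adding to refunds, the second ratio is best read as a rough comparison.

```python
def average_credit(eicam: float, eic: float) -> float:
    """Average EITC per EITC return (eicam / eic)."""
    return eicam / eic


def eitc_share_of_refunds(eicam: float, refam: float) -> float:
    """EITC dollars relative to refund dollars on EITC returns (eicam / refam)."""
    return eicam / refam


# Hypothetical EITC Returns Only figures for one geography and tax year.
eic, eicam, refam = 1000, 1_900_000, 2_500_000
print(average_credit(eicam, eic))           # 1900.0
print(eitc_share_of_refunds(eicam, refam))  # 0.76
```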

    What do the RAL and RAC variables mean? How can I use them?
    RAL is an abbreviation for refund anticipation loan while RAC stands for refund anticipation check, products sold by most commercial tax preparers. By purchasing a RAL, the tax filer assigns the proceeds of his/her tax refund to the preparer's bank partner, and the preparer arranges a loan for the taxpayer in the amount of his/her refund, net of fees for tax preparation and the loan itself. The bank makes the loan available to the taxpayer within 1-2 days, and the IRS typically delivers the taxpayer's refund to the bank within about 10 days. For this short-term loan, the taxpayer often pays fees in excess of $100 (in addition to the fees they pay to have their taxes prepared and filed), and incurs an implicit annual interest rate on the loan of 250 percent or higher. With a RAC, instead of issuing a loan within 1-2 days, the bank waits to receive the funds from the IRS before issuing a check to the filer via the paid tax preparer. This process usually takes 7-15 days. Commercial tax preparers often offer RACs as a default product for filers who are initially denied a RAL. Some filers use RACs to pay commercial tax preparation fees from the proceeds of their tax refunds.
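
    To see how a flat fee on a loan of a few days translates into a very high annual rate, consider the sketch below. It uses a simple, non-compounded annualization, which is an assumption: the guide does not say how its 250 percent figure was derived. The loan amount, fee, and term are hypothetical.

```python
def implied_annual_rate(fee: float, loan_amount: float, days_outstanding: float) -> float:
    """Simple (non-compounded) annualized rate implied by a flat fee on a short loan."""
    return (fee / loan_amount) * (365 / days_outstanding)


# Hypothetical RAL: $100 in loan fees on a $1,400 loan repaid in 10 days.
rate = implied_annual_rate(100, 1400, 10)
print(f"{rate:.0%}")  # about 261%
```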

    Low-income taxpayers who claim the EITC represent the majority of the marketplace for RALs. The product's popularity varies substantially across the U.S., but the most recent data indicate that 30 percent of all refund recipients who received the EITC in tax year 2006 requested a RAL. In the same year, 19 percent of EITC recipients requested a RAC. Recent research suggests that between 10 and 15 percent of all RAL requests are rejected, though rejected RAL applications that default to a RAC still result in a fee for the taxpayer.

    The ral variable in the database represents the number of returns for which the taxpayer requested a RAL. To determine the proportion of filers in your community who requested RALs, divide the ral field by the ref field for the tax year selected. You can calculate this proportion either for All Tax Returns or EITC Returns Only. To determine the proportion of filers requesting RACs in a given year, divide the rac field by the ref field.
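
    A minimal sketch of these two proportions in Python, using hypothetical counts (the function name refund_product_rates is illustrative only):

```python
def refund_product_rates(ral: float, rac: float, ref: float) -> tuple[float, float]:
    """Return (RAL rate, RAC rate) as shares of returns receiving a refund."""
    if ref <= 0:
        raise ValueError("ref must be positive")
    return ral / ref, rac / ref


# Hypothetical EITC Returns Only figures: 300 RALs and 190 RACs among 1,000 refunds.
ral_rate, rac_rate = refund_product_rates(300, 190, 1000)
print(f"RAL: {ral_rate:.0%}, RAC: {rac_rate:.0%}")  # RAL: 30%, RAC: 19%
```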

    To estimate how much money EITC filers in your community are spending on RALs and RACs, investigate what local firms charge for the service. Ask local taxpayers who have received the EITC and purchased a refund product in the past if they'd be willing to share their documentation with you. Often these products are referred to familiarly as "rapid refund" or "fast cash" products. Typically, tax preparers and their bank partners charge a fee for the RAL that is based on the size of the anticipated refund, plus additional flat "documentation" or "loan preparation" fees. With information on the average price for these products in your community, and the number of RALs and RACs requested by EITC filers in years past, you can estimate the amount that low-income filers in your community spend on these high-cost refund products.
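
    Once you have local price information, the spending estimate described above is a simple multiplication. The sketch below assumes one average fee per product type. Both fee values are hypothetical placeholders, not figures from this guide, and should be replaced with prices gathered in your community.

```python
def estimated_refund_product_spending(
    ral_count: float, rac_count: float, avg_ral_fee: float, avg_rac_fee: float
) -> float:
    """Estimated dollars spent on RALs and RACs: count times average fee, summed."""
    return ral_count * avg_ral_fee + rac_count * avg_rac_fee


# Hypothetical inputs: 300 RALs at an average $100 fee, 190 RACs at an average $30 fee.
print(estimated_refund_product_spending(300, 190, 100, 30))  # 35700
```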

    Important note: The IRS recorded a significant drop in the volume of RALs requested between tax years 2003 and 2004. Some of this drop may be attributable to improved reporting that distinguished RALs from other non-loan financial products sold by tax preparers. Some may be attributable to increased public awareness of the problems with RALs.

    How many people in my community are eligible for the EITC, but don't receive it? What's the amount that is "left on the table" as a result?
    The best available research by the IRS and other scholars suggests that between 80 and 85 percent of tax filers who are eligible for the EITC claim the credit. That participation rate exceeds rates for other well-known income support programs like Food Stamps and TANF cash assistance. However, because the EITC can provide a family with such a significant cash infusion, and because a broader range of working families are eligible for the credit than for other means-tested programs, local organizations are devoting significant effort to alerting potentially eligible families about how to claim the EITC.

    The IRS continues to work on new research methodology that will provide better estimates on participation rates in the EITC at a small-area level. (For a detailed explanation of the issues involved in arriving at such estimates, see: Earned Income Credit Participation—What We (Don't) Know) In the interim, because the actual population that claims the EITC changes so much from year to year (one-third of filers who claimed the credit in any given year did not claim it the prior year), and because some families will inevitably miss out on the credit despite the best outreach efforts, the total number of eligible families not claiming the credit in your community may not provide the most useful benchmark for judging outreach efforts.

    A more useful approach might be to ask, if you were able to increase the number of eligible filers in your area who claim the credit by 5 percent, how many additional workers and families would benefit, and how many additional EITC dollars would flow into your community? The participation gap is likely to be larger in communities that have more: very low-income working families (incomes under $10,000); low-income Hispanic families and families whose first language is not English; and families with more than two children. Eligible members of these groups have been found to claim the credit at lower rates than the national average. If your community has large numbers of these types of families, you may have the opportunity to raise the number of families claiming the credit by perhaps as much as 10 percent. To calculate the potential additional number of eligible filers, simply multiply the eic value for the most recent year by 5 percent to 10 percent.

      Number of EITC-eligible non-filers that an effective outreach campaign could encourage to file = eic x 5% to 10%
    Research also suggests that eligible filers who fail to claim the credit are typically eligible for somewhat smaller credits on average than those filers who do claim the credit. To calculate the potential economic benefit for these families and your community, multiply the average EITC in your community (eicam divided by eic) by 75 percent, and multiply the result by the estimated number of additional eligible filers you hope to reach:

      EITC dollars that an outreach campaign could add to families and the community = eic x 5% to 10% x (eicam / eic) x 75% = eicam x 3.75% to 7.5%
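
    The two outreach formulas above can be checked with a short Python sketch. The community figures are hypothetical and the function name outreach_estimate is illustrative. The 75 percent adjustment and the 5 to 10 percent range come from the text above.

```python
def outreach_estimate(
    eic: float, eicam: float, uptake_gain: float, credit_ratio: float = 0.75
) -> tuple[float, float]:
    """Return (additional filers, additional EITC dollars) for a given uptake gain."""
    additional_filers = eic * uptake_gain
    average_credit = eicam / eic
    additional_dollars = additional_filers * average_credit * credit_ratio
    return additional_filers, additional_dollars


# Hypothetical community: 10,000 EITC returns totaling $19,000,000 in credits.
for gain in (0.05, 0.10):
    filers, dollars = outreach_estimate(10_000, 19_000_000, gain)
    print(f"{gain:.0%}: {filers:,.0f} filers, ${dollars:,.0f}")
# 5%: 500 filers, $712,500
# 10%: 1,000 filers, $1,425,000
```

    The dollar results match the shortcut eicam x 3.75% to 7.5%: $19,000,000 x 3.75% is $712,500.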
    Finally, the introduction of the Additional Child Tax Credit (ACTC), the refundable version of the Child Tax Credit (CTC), increases the refund amounts available to many EITC-eligible filers. National figures show that taxpayers claimed almost $15.5 billion in ACTC in tax year 2006, with 61 percent of those dollars ($9.6 billion) going to EITC filers. For tax years 2004 through 2006, users can calculate the relative contribution of the ACTC to taxpayers' refunds by comparing the actcam and refam variables for either All Tax Returns or EITC Returns Only in the selected geography and tax year.

    Note that these estimates would provide you with the potential impact on a one-year basis, and would not take into account broader factors that also influence the number of filers in a particular community who receive the credit, including: population growth/decline; changing employment and wage levels; and increases and decreases in the number of children living at home. At the same time, because they represent only one-year estimates, longer-range plans should take into account that these potential economic benefits would recur on an annual basis.

    Can I use these data to assess the impact that my campaign has had on participation in the EITC?
    As noted above, a number of factors influence the number of filers, and proportion of total filers, who claim the EITC in a given community. Knowledge of the credit is just one of these factors. Economic and demographic changes arguably exert even greater influence on EITC usage.

    Recognizing that, the data provided here may afford outreach coordinators the opportunity to track the number of EITC claimants, and the proportion of total filers they represent, over time and to compare results to those from similar communities that are not targets for outreach. Economic forces typically operate at the regional level, so for purposes of comparison you might look for other communities of similar size and population makeup within your region. These comparisons may work best for larger units of geography—cities and counties—rather than small units like ZIP codes where population and employment changes may play a larger role.
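
    One way to set up such a comparison is to compute the change in the EITC share of filers (eic divided by return) between two tax years for the outreach community and for a comparison community. The sketch below uses hypothetical figures and an illustrative function name. A gap between the two changes is only suggestive, since economic and demographic differences can also explain it.

```python
def share_change(
    eic_by_year: dict[int, float], returns_by_year: dict[int, float], start: int, end: int
) -> float:
    """Change in the EITC share of filers between two tax years, in percentage points."""
    start_share = eic_by_year[start] / returns_by_year[start]
    end_share = eic_by_year[end] / returns_by_year[end]
    return (end_share - start_share) * 100


# Hypothetical figures for an outreach city and a similar comparison city.
target = share_change({2004: 1500, 2006: 1800}, {2004: 10000, 2006: 10000}, 2004, 2006)
comparison = share_change({2004: 1500, 2006: 1600}, {2004: 10000, 2006: 10000}, 2004, 2006)
print(f"target: {target:+.1f} pts, comparison: {comparison:+.1f} pts")
# target: +3.0 pts, comparison: +1.0 pts
```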

    One area in which you could more easily track the impact of community outreach efforts is the usage of RALs and RACs. This measure may be more sensitive to outreach and public awareness campaigns, and less sensitive to broader changes in the local population and economy. Reducing the proportion of taxpayers who use RALs and RACs over time can help keep more money in working families' pockets.

    Can I use these data to characterize the total economic impact that EITC dollars have in my local economy?
    Economists often describe the total economic impact of a fiscal injection such as the EITC into a local economy through the use of an "economic multiplier." The multiplier represents the factor by which total economic output resulting from the initial investment exceeds that investment, due to the additional economic activity it spurs. The multiplier in any given local economy depends on the interdependence of its different sectors, so it may vary widely from community to community. One example of research on the total economic impact of EITC dollars, for the city of San Antonio, found that refunded EITC dollars spent in the local economy would generate a total economic impact 58 percent larger than those initial expenditures.
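
    As a rough illustration of how a multiplier is applied, the sketch below scales an amount of EITC dollars spent locally by a multiplier of 1.58, which mirrors the San Antonio finding cited above. The dollar amount is hypothetical, and because multipliers vary widely from community to community, the 1.58 value should not be assumed to hold elsewhere.

```python
def total_economic_impact(eitc_dollars_spent_locally: float, multiplier: float) -> float:
    """Total economic impact implied by an economic multiplier."""
    return eitc_dollars_spent_locally * multiplier


# Hypothetical: $1,000,000 of refunded EITC dollars spent locally, multiplier of 1.58.
impact = total_economic_impact(1_000_000, 1.58)
print(f"${impact:,.0f}")  # $1,580,000
```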