National Longitudinal Survey of Youth 1979 (NLSY79)

Geographic Residence & Neighborhood

Created variables

PUBLIC USE VARIABLES

  • Region of residence at each survey date (Northeast, North Central, South, or West)
  • Information on whether the current residence is in an urban or rural county
  • Through 1996, this series was based on the respondent's State and county of residence and the "% urban population" data from the County & City Data Book. From 1998-2002 this item was based on whether the respondent was living in an urbanized area or in an area with a population greater than 2,500. Beginning in 2004, this item indicates whether the respondent resides within an urban cluster or urbanized area. For further information see the Geocode Codebook Supplement.
  • Information on whether the current residence is in a Metropolitan Statistical Area (MSA), the central city of an MSA, or outside of an MSA
  • Based upon zip code, State, and county matches with metropolitan statistical designations for place of residence, the location of the respondent is determined to be within or outside of a metropolitan statistical area.
  • Beginning in 1988, whether the current residence is in the United States
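Because the basis of the urban/rural item changes across survey years, users comparing values over time may want to record which definition applies to each round. The following is a minimal sketch only; the function name and the returned labels are illustrative assumptions, not names used in the NLSY79 data.

```python
def urban_rural_basis(survey_year: int) -> str:
    """Return a short label for the definition behind the urban/rural item.

    The year ranges follow the description above; the labels themselves
    are hypothetical and exist only for this illustration.
    """
    if 1979 <= survey_year <= 1996:
        return "percent urban population of county (County & City Data Book)"
    if 1998 <= survey_year <= 2002:
        return "urbanized area or area with population over 2,500"
    if survey_year >= 2004:
        return "urban cluster or urbanized area"
    # Years outside the ranges described in the text are rejected.
    raise ValueError(f"no urban/rural definition described for {survey_year}")
```

A label of this kind could be stored next to each year's value so that apparent changes in urban/rural status can be checked against changes in the definition.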

GEOCODE FILE VARIABLES

  • The specific county and State (both edited) of residence at the time of interview, coded with Federal Information Processing Standards (FIPS) codes
  • Similar information is provided for the respondent's residence at birth and at age 14
  • The specific metropolitan area of residence at the time of interview. As applicable, information may be included for the following types of metropolitan areas:
    • SMSA-Standard Metropolitan Statistical Area
    • MSA-Metropolitan Statistical Area
    • CMSA-Consolidated Metropolitan Statistical Area
    • PMSA-Primary Metropolitan Statistical Area
    • NECMA-New England County Metropolitan Area
    • CBSA-Core Based Statistical Area
  • Distance between respondent addresses at each interview round (see Appendix 22: Migration Distance Variables for Respondent Locations).
  • This supplements the data on state and county of residence and is available only on the geocode release
  • The distance between the respondent's addresses at each date of interview was created for all unique pairs of survey years
  • The data described here do not actually provide a location for the respondent's residence; these variables only provide distances between the various places the respondent lives
  • This pairwise matrix of variables enables various types of migration research by enabling users to consider the distance between residences and to identify return migration to an area where the respondent has lived in the past
  • Indicators of the quality of the geographic data:
    • An address may not be available for the respondent
    • In such cases the respondent's address is geocoded to the centroid of the zip code when we can determine the zip code
    • To identify these cases, an indicator for the quality of this distance measure was created based on the quality of the matches in both years
  • An indicator for whether the respondent was located in the same zip code was created for all pairs of years
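The pairwise distance variables described above can be used to look for return migration. The sketch below is one possible approach and is not the method used to build the Geocode files: the dictionary layout, the function names, and the `near` and `far` thresholds are all assumptions made for illustration, and the thresholds should be expressed in whatever unit the distance variables use.

```python
from itertools import combinations


def year_pairs(survey_years):
    """Return all unique (earlier, later) pairs of survey years."""
    return list(combinations(sorted(survey_years), 2))


def return_moves(distances, near=1.0, far=50.0):
    """Find (origin, away, back) year triples that suggest return migration.

    distances maps (earlier_year, later_year) to the distance between the
    two residences. A triple is reported when the respondent was at least
    `far` from the origin in the middle year and within `near` of the
    origin in the last year. Both thresholds are hypothetical.
    """
    years = sorted({year for pair in distances for year in pair})
    found = []
    for origin, away, back in combinations(years, 3):
        moved = distances.get((origin, away))
        returned = distances.get((origin, back))
        if moved is None or returned is None:
            continue  # skip triples with a missing distance
        if moved >= far and returned <= near:
            found.append((origin, away, back))
    return found
```

For n survey years, `year_pairs` yields n(n-1)/2 pairs, which is the size of the pairwise matrix for one respondent. Users would also want to consult the distance quality indicator before treating a small distance as evidence of the same location.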

Important information about using restricted-use Geocode data

  • The level of detail available determines whether a variable is placed within the restricted release "Geocode xxxx" files. For example, general country-level information, such as whether the respondent resided within or outside of the United States at various points in time, is available to all users with no restriction, while the specific county or SMSA in which he or she resided at a specific interview point is present only within the restricted-use Geocode data files.
  • Researchers interested in using restricted-use Geocode data must submit an application to BLS. These confidential files are available for use only at the BLS National Office in Washington, DC, and at Federal Statistical Research Data Centers (FSRDCs) on statistical research projects approved by BLS. Access to data is subject to the availability of space and resources. Information about applying to use the zip code and Census tract data is available on the BLS Restricted Data Access page.
  • The "Misc. xxxx" areas of interest contain a set of variables titled 'Does R Live on a Farm or in a Rural Area?' The interviewer answers this question based on observation when at the respondent's permanent residence; if the interview takes place elsewhere, the interviewer asks the respondent about the place of residence. There are no consistent criteria for the definition of nonfarm property as rural. These variables should not be considered a replacement for the created KEY VARIABLE, 'Current Residence Urban/Rural?'
  • The coding of respondents' geographic location before 1993 required extensive hand-editing and may not be completely accurate. The most common error is the potential assignment of a respondent to an adjacent county of residence. Data on addresses, zip codes, and phone numbers are used to clean the geographic codes. The post-1988 use of telephone number information improved data quality. A brief discussion below provides more information on both the hand-edits performed each year and the created variable that indicates the extent of hand-editing required for each case; see Appendix 10 in the Geocode Codebook Supplement for more details.

Geographic data for NLSY79 respondents fall into two categories: information on the main public file and more detailed information released as restricted-use Geocode data. Table 1 lists NLSY79 geographic variables along with their areas of interest. Variables with a "Geocode xxxx" area of interest are restricted-use data; all others are public use.

Table 1. Select Residence Variables by Survey Year and Area of Interest: NLSY79 Main and Geocode Files
Variables | Survey Year(s) | Area of Interest | Documentation
Residence at Birth
  Country - U.S. or Other Country | 1979, 1983 | Geocode 1979 | --
  Country - Actual Other Country | 1979 | Geocode 1979 | Attachment 101
  County | 1979 | Geocode 1979 | Attachment 102
  State | 1979 | Geocode 1979 | Attachment 102
  South/Non-South | 1979 | Family Background | Attachment 100
Residence at Age 14
  Country - U.S. or Other Country | 1979 | Geocode 1979 | --
  Country - Actual Other Country | 1979 | Geocode 1979 | Attachment 101
  County | 1979 | Geocode 1979 | Attachment 102
  State | 1979 | Geocode 1979 | Attachment 102
  South/Non-South | 1979 | Family Background | Attachment 100
  Area of Residence - Urban/Rural | 1979 | Family Background | User's Guide and Appendix 6
Present Residence
  Lived in Since Birth | 1979 | Family Background | --
  Year of Move to | 1979 | Family Background | --
Most Recent Residence
  5th-1st Country/County/State Since Jan. 1978 | 1979 | Geocode 1979 | Attachment 101, Attachment 102
  Month/Year of Move(s) | 1979 | Family Background | --
  5th-1st Country/County/State Since Last Int. | 1980 | Geocode 1980 | Attachment 101, Attachment 102
  Month/Year of Move(s) | 1980 | Family Background | Attachment 102
  9th-1st Country/County/State Since 1980 Int. | 1982 | Geocode 1982 | Attachment 101, Attachment 102
  Month/Year of Move(s) | 1982 | Family Background | --
Current Residence
  Region | 1979-2020 | Key Variables | Attachment 100
  Urban/Rural | 1979-2020 | Key Variables | User's Guide and Appendix 6
  SMSA/Central City | 1979-2020 | Key Variables | User's Guide and Appendix 6
  In U.S. | 1979-2020 | Misc. xxxx | NLSY79 User's Guide
  County | 1979-2020 | Geocode xxxx | Attachment 102
  State | 1979-2020 | Geocode xxxx | Attachment 102
  SMSA | 1979-2020 | Geocode xxxx | Attachment 104
  PMSA | 1979-2020 | Geocode xxxx | Attachment 104
  MSA | 1979-2020 | Geocode xxxx | Attachment 104
  CMSA | 1979-2020 | Geocode xxxx | Attachment 104
  MSA/CMSA/NECMA | 1979-2020 | Geocode xxxx | Appendix 10
  CBSA | 1979-2020 | Geocode xxxx | Appendix 10
  Main Reason for Moving Since Date of Last Interview | 2018, 2020 | Family Background | NLSY79 User's Guide

Related Variables: Related NLSY79 main file variables discussed in the Household Composition and Family Background sections of this guide include:

  • Type of residence or dwelling unit at the time of interview (such as dorm, hospital, jail, orphanage, own home, and so forth)
  • Childhood living arrangements of NLSY79 respondents from birth to age 18, including not only information on persons with whom the respondent lived (such as biological versus adoptive and step-parents) but also on institutions such as children's homes, group care homes, or detention centers/jails/prisons in which he or she may have resided.

Geocode file variables

  • Information on the State, county, and metropolitan statistical area of residence for each respondent (the current residence variables) is merged with information from several other data files, namely the City Reference File (Census 1973, 1982, 1983, 1987, 1992) and the County & City Data Book (Census 1972, 1977, 1983, 1988, 1994), to provide detailed information on the environmental characteristics of the State, county, and metropolitan statistical areas in which each NLSY79 respondent resides. NOTE: Users may attach additional county and metropolitan statistical area-level data from a variety of sources by merging information from the desired source with the Geocode data based upon the State, county, and metropolitan statistical area of residence codes in the Geocode file.
  • For select survey years Geocode information is available on the location of respondents' jobs, the location of colleges attended, and the point of discharge from military service
  • Unemployment rate of each respondent's labor market of current residence:
    • The source of the 'Unemployment Rate' variables is the May issue of the Bureau of Labor Statistics' Employment and Earnings for the year following the survey year. Figures from March of each survey year are used. This table supplies unemployment rates for each State and for selected metropolitan statistical areas. Respondents who reside within one of these metropolitan statistical areas are assigned the appropriate unemployment rate. For those residing outside of these areas, a "balance of State" unemployment figure is computed using State total figures for the size of the civilian labor force and the number employed and subtracting the population living in metropolitan statistical areas.
    • Additional information on these variables can be found in Appendix 7 in the NLSY79 Geocode Codebook Supplement.
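The "balance of State" computation described above can be illustrated with a short sketch. It assumes that the labor force and employment counts of the listed metropolitan statistical areas are subtracted from the State totals and that the rate is the unemployed share of the remaining labor force; the function name and argument layout are hypothetical, and Appendix 7 remains the authoritative description.

```python
def balance_of_state_rate(state_labor_force, state_employed, metro_areas):
    """Unemployment rate (percent) for the part of a State outside the
    listed metropolitan statistical areas.

    metro_areas is an iterable of (labor_force, employed) pairs, one per
    metropolitan statistical area. This layout is an assumption made for
    the illustration.
    """
    metro_areas = list(metro_areas)
    metro_labor_force = sum(lf for lf, _ in metro_areas)
    metro_employed = sum(emp for _, emp in metro_areas)

    balance_labor_force = state_labor_force - metro_labor_force
    balance_employed = state_employed - metro_employed
    if balance_labor_force <= 0:
        raise ValueError("no labor force remains outside the metropolitan areas")

    balance_unemployed = balance_labor_force - balance_employed
    return 100.0 * balance_unemployed / balance_labor_force
```

With invented figures of 1,000,000 in the State civilian labor force, 930,000 employed, and two metropolitan areas of (400,000, 376,000) and (200,000, 188,000), the balance of State has a labor force of 400,000 with 34,000 unemployed, a rate of 8.5 percent.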

Types of County or Metropolitan Statistical Area Environmental Characteristics in the NLSY79 restricted-use Geocode data:

  • Population sizes
  • Percent of population that is:
    • urban
    • black
    • female
    • under 5 years old
    • 65+ years old
  • Birth/death/marriage/divorce rates
  • Physician and hospital bed rates
  • Crime rates
  • Poverty level data
  • Educational attainment levels 
  • Median family and per capita income
  • Recipients of and payments from:
    • AFDC
    • SSI
    • Social Security
  • Labor force statistics:
    • total labor force
    • civilian labor force
    • number of females in the civilian labor force
    • civilians unemployed versus employed
    • percent employed in various industries
  • Unemployment rate for labor market of residence
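Users who attach additional county-level characteristics from another source do so by matching on the State and county of residence codes. The following is a minimal sketch using plain Python dictionaries; the field names `state_fips` and `county_fips`, the function name, and the sample values are assumptions made for illustration and do not correspond to actual Geocode variable names.

```python
def attach_county_data(respondents, county_table, fields):
    """Left-join county-level fields onto respondent records.

    respondents  : list of dicts holding 'state_fips' and 'county_fips'
    county_table : dict mapping (state_fips, county_fips) to a dict of values
    fields       : names of the county-level values to copy over
    """
    merged = []
    for row in respondents:
        key = (row["state_fips"], row["county_fips"])
        county_values = county_table.get(key, {})
        out = dict(row)  # copy so the input records are left unchanged
        for name in fields:
            # None marks a county that is absent from the external source
            out[name] = county_values.get(name)
        merged.append(out)
    return merged


# Invented records for illustration only
sample = [
    {"id": 1, "state_fips": "39", "county_fips": "049"},
    {"id": 2, "state_fips": "39", "county_fips": "999"},
]
external = {("39", "049"): {"median_income": 50000}}
result = attach_county_data(sample, external, ["median_income"])
```

Keeping unmatched respondents with a missing value, rather than dropping them, makes it easier to see how many cases fall outside the external source.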

Geographic Residence: Detailed geographic mobility information was collected in 1979-80, in 1982, and from 2000 forward; data were gathered on the country/county/State and timing of up to five residential moves since January 1978 or since the last interview. Beginning in 2000, only significant geographical moves were recorded.

Neighborhood Quality: The neighborhood quality series (1992 and 1994-2000) is taken from the National Commission on Children Parent & Child Study, 1990 Parent Questionnaire. In this series of questions, respondents rate how much of a problem issues such as crime, lack of police protection, unsupervised children, and joblessness are in their neighborhood.

Other Geographic Variables: Users may obtain special permission to use zip code and Census tract data available at the BLS offices in Washington, DC.

Edited versus Unedited Versions of State/County of Residence: For some years (1979-82, 1988-89, 1991-92), two versions of the State and county of residence variables have been included in the "Geocode xxxx" files. The set occurring at the beginning of each file is the edited version, while the variables found near the end of the files for these years are unedited. If the variable has an actual source question number/name, it is the original from NORC. If the source question name says created, it is the edited/created version. Note that the unedited variables are sometimes combined into a single variable, with the State and county code appended to each other. These raw variables are preceded by the word "GEOCODE" in the variable title. The edited residence variables contain the corrections made for erroneous address information and are the ones from which the Geocode files themselves are constructed. Users should be aware that the edited version of these variables does not contain data for those respondents who are in the active military forces or who are living abroad or in a U.S. territory.  Codes of "-4" appearing in the unedited versions of the State or county variables (because foreign country and U.S. territory codes are placed in one field or the other) should not appear in the edited versions of these residence variables.
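When a raw variable holds the State and county codes appended to each other, users may need to separate them before comparing with the edited variables. The sketch below assumes a two-digit State code followed by a three-digit county code (the standard FIPS widths) and treats negative values such as -4 as missing codes; the actual layout of each raw variable should be confirmed in the codebook, and the function name is hypothetical.

```python
def split_state_county(combined):
    """Split a combined State+county code into (state, county) strings.

    Assumes a 2-digit State code followed by a 3-digit county code.
    Returns None for negative values, which are treated here as
    missing-data codes rather than locations.
    """
    if combined < 0:
        return None
    text = f"{combined:05d}"  # restore any leading zero dropped in storage
    if len(text) > 5:
        raise ValueError(f"unexpected code width: {combined}")
    return text[:2], text[2:]
```

For example, under this assumed layout `split_state_county(1001)` gives `("01", "001")` and `split_state_county(-4)` gives `None`.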

New Geocode Procedures for Assigning Residence Codes and Hand-Editing Discrepant Cases: During the 1988 hand-editing process, it became evident that the telephone numbers were very accurate, even in cases for which the address information contained discrepancies. Beginning in 1989, the area code and phone exchange were used to assign State and county of residence codes. The State assigned by the area code was then compared to the State assigned on the basis of zip code alone and the State contained in the original NORC respondent file. A "quality of match" variable was computed on the basis of how well these States match. For a more detailed discussion of these new assignment and matching procedures, refer to Appendix 10: Geocode Documentation in the Geocode Codebook Supplement. This process was used through the 1994 release.
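The idea of a "quality of match" measure based on agreement among the three State sources can be sketched as follows. This is an illustration only: the categories and their labels are hypothetical, and the actual coding of the variable is documented in Appendix 10.

```python
def state_match_quality(phone_state, zip_state, norc_state):
    """Summarize agreement among three State assignments.

    Each argument is a State code, or None when that source is missing.
    The three returned labels are invented for this sketch.
    """
    known = [s for s in (phone_state, zip_state, norc_state) if s is not None]
    distinct = len(set(known))
    if len(known) == 3 and distinct == 1:
        return "all three agree"
    if len(known) >= 2 and distinct < len(known):
        return "two agree"
    return "no agreement"
```

A missing source is counted as not agreeing with anything, so a case with only two known and matching sources is placed in the middle category.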

The hand-editing procedure has also been streamlined. In 1989, the first year in which the phone assignment procedure was used, the residence codes assigned on the basis of the area code and exchange were compared to the raw residence variables received from NORC. Those with information that did not match were identified for individual examination. Ideally, the discrepancies requiring individual examination would be reduced to those cases which are "genuine movers" or which have zip codes covering multiple counties and would require some verification that the correct county was assigned based upon the phone information. The current process for identifying discrepancies and hand-editing is aimed more directly at achieving this objective. 

Beginning in 1990, the residence codes assigned based on phone information were compared to the 1989 CHRR-edited residence information to identify cases for individual examination. Because the previous year's edited variables incorporate the corrections that were made in the hand-editing process from earlier years, repeated editing of the same cases across years decreased. Through this process, the discrepancies in residential Geocode information were reduced. The number of cases requiring individual examination also decreased and was restricted more closely to the population of "genuine movers" and people with multiple-county zip codes and phone numbers that require verification of county of residence. 
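The comparison step described above, in which phone-based codes are checked against the previous year's edited codes to select cases for individual examination, can be sketched as a simple filter. The data layout and function name are assumptions for illustration and do not describe the actual CHRR software.

```python
def cases_to_review(phone_assigned, prior_edited):
    """Return respondent IDs whose phone-based residence code differs
    from the previous year's edited code.

    Both arguments map a respondent ID to a (state, county) tuple.
    Respondents with no prior-year code are also flagged, since there
    is nothing to confirm the new assignment against.
    """
    flagged = []
    for respondent_id, code in sorted(phone_assigned.items()):
        if prior_edited.get(respondent_id) != code:
            flagged.append(respondent_id)
    return flagged
```

Cases that come out of such a filter would include genuine movers as well as respondents whose zip code or phone exchange spans more than one county, which is the population the text says the review is meant to isolate.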

The hand-editing process in previous years included not only these genuine movers and multi-county zip code dwellers, but also other cases for which elements of the address are simply in error or incompatible with each other. Some of these cases could potentially require editing for the same errors in more than one year, even if the respondent stayed in one location. Hand-editing procedures were further streamlined, and in some cases automated, to produce the 1992 data.

Beginning in 1996, a new procedure for verifying and assigning correct final Geocode information was instituted. This procedure is now performed using specialized address tracking Geocode software. The processes are described in Appendix 10. CHRR staff members believe that the current procedures not only identify true discrepancies and streamline the hand-editing process more efficiently, but also result in more accurate and consistent assignment of State and county codes in general.

Missing Values, New England Cases, and Mobility: Missing values in location of residence variables and metropolitan statistical area codes are associated with respondents who are in the active military forces or who are living abroad or in a U.S. territory. Users should be aware that, because the New England County Metropolitan Area (NECMA) codes are not comparable to metropolitan statistical areas from the remainder of the country, New England cases are eliminated from some of the procedures used to construct the Geocode files.

The review and hand-editing process has been periodically revised to improve the accuracy of the data and the efficiency of data production. The potential implications for effects on mobility rates between some years due to these changes have been noted in Appendix 10: Geocode Documentation. Users should read Appendix 10 carefully to gain a better understanding of the issues outlined above and their implications for specific research endeavors.

Comparison to Other NLS Cohorts: Data on the respondent's area of residence are available for all cohorts. Geographic residence information for those NLSY79 children who resided with their mother can be inferred from the residence data of their mothers. The NLSY97 main created variables indicate whether the respondent lives in an urban or rural area, whether the respondent lives in a Metropolitan Statistical Area, and in which Census region the respondent resides. More detailed information is available on the restricted-use Geocode data. Region of residence and geographic mobility of Original Cohort respondents are provided for most survey years.

Survey Instruments and Documentation

Data on residence at birth and at age 14, as well as the 1979-82 present/most recent residence series, were collected using questions found within Section 1 ("Family Background" and "On Family") of the 1979, 1980, and 1982 questionnaires. All other variables are created from or determined by the geographic information provided by each NLSY79 respondent within the locator section of the questionnaire or from the interviewing Face Sheet or internal NORC locating files. Several attachments and appendices in the NLSY79 Codebook Supplement and/or the NLSY79 Geocode Codebook Supplement offer creation procedure information and coding systems for the geographic residence variables. The following are relevant to the Geocode:

Areas of Interest: Residence variables can be found within the "Family Background," "Key Variables," "Geocode xxxx," or "Misc. xxxx" areas of interest; the table above specifies the particular areas of interest for each variable. All environmental variables, including the 'Unemployment Rate for the Labor Market of Current Residence,' are present in the "Geocode xxxx" areas of interest in the restricted-use Geocode data.