4.18 Geographic Residence & Environmental Characteristics

Geographic data for NLSY79 respondents fall into two categories:  information on the main public file and more detailed information released on a restricted-access Geocode CD.  Researchers interested in obtaining this Geocode CD must submit an application to BLS and agree to meet security requirements.  For more information about the Geocode application process, see http://www.bls.gov/nls/geocodeapp.htm.  This section first describes the main file variables and then discusses the data available in the Geocode file.  Table 4.18.1 lists NLSY79 geographic variables along with their areas of interest; variables in the “Geocode xxxx” area are located on the restricted-use CD and all other variables are available on the main public file.

Main file geographic variables

Variables created for each survey year include the following (see the Geocode discussion below for more information on variable creation procedures): 

·         Region of residence at birth, age 14, and survey dates (Northeast, North Central, South, or West)

·         Information on whether the current residence is in an urban or rural county

·          Information on whether the current residence is in a Metropolitan Statistical Area (MSA), the central city of an MSA, or outside of an MSA

·         Beginning in 1988, whether the current residence is in the United States

Related NLSY79 main file variables discussed in the  "Household Composition" and "Family Background" sections of this guide include (1) type of residence or dwelling unit at the time of interview (such as dorm, hospital, jail, orphanage, own home, and so forth) and (2) childhood living arrangements of NLSY79 respondents from birth to age 18, including not only information on persons with whom the respondent lived (such as biological versus adoptive and step-parents) but also on institutions such as children’s homes, group care homes, or detention centers/jails/prisons in which he or she may have resided.

User Notes:  The “Misc. xxxx” areas of interest contain a set of variables titled ‘Does R Live on a Farm or in a Rural Area?’  The interviewer answers this question based on observation when at the respondent’s permanent residence; if the interview takes place elsewhere, the interviewer asks the respondent about the place of residence.  There are no consistent criteria for the definition of nonfarm property as rural.  These variables should not be considered a replacement for the created KEY variable, ‘Current Residence Urban/Rural?’  The KEY variable is based on the proportions of urban and rural populations in the county of residence, determined using the County & City Data Book or Geocode software.

Table 4.18.1 Select Residence Variables by Survey Year and Area of Interest: NLSY79 Main & Geocode Files

Variables                                       Survey Year(s)  Area of Interest    Documentation

Residence at Birth
  Country - U.S. or Other Country               1979, 1983      Geocode 1979
  Country - Actual Other Country                1979            Geocode 1979        Attachment 101
  County                                        1979            Geocode 1979        Attachment 102
  State                                         1979            Geocode 1979        Attachment 102
  South/Non-South                               1979            Family Background   Attachment 100

Residence at Age 14
  Country - U.S. or Other Country               1979            Geocode 1979
  Country - Actual Other Country                1979            Geocode 1979        Attachment 101
  County                                        1979            Geocode 1979        Attachment 102
  State                                         1979            Geocode 1979        Attachment 102
  South/Non-South                               1979            Family Background   Attachment 100
  Area of Residence - Urban/Rural               1979            Family Background   User's Guide & App. 6

Present Residence
  Lived in Since Birth                          1979            Family Background
  Year of Move to                               1979            Family Background

Most Recent Residence
  5th-1st Country/County/State Since Jan. 1978  1979            Geocode 1979        Attachment 101, Attachment 102
  Month/Year of Move(s)                         1979            Family Background
  5th-1st Country/County/State Since Last Int.  1980            Geocode 1980        Attachment 101, Attachment 102
  Month/Year of Move(s)                         1980            Family Background   Attachment 102
  9th-1st Country/County/State Since 1980 Int.  1982            Geocode 1982        Attachment 101, Attachment 102
  Month/Year of Move(s)                         1982            Family Background

Current Residence
  Region                                        1979–2006       Key Variables       Attachment 100
  Urban/Rural                                   1979–2006       Key Variables       App. 6 & User's Guide
  SMSA/Central City                             1979–2006       Key Variables       App. 6 & User's Guide
  In U.S.                                       1988–2006       Misc. xxxx          NLSY79 User’s Guide
  County                                        1979–2006       Geocode xxxx        Attachment 102
  State                                         1979–2006       Geocode xxxx        Attachment 102
  SMSA                                          1979–2006       Geocode xxxx        Attachment 104
  PMSA                                          1983–2006       Geocode xxxx        Attachment 104
  MSA                                           1983–2006       Geocode xxxx        Attachment 104
  CMSA                                          1983–2006       Geocode xxxx        Attachment 104
  MSA/CMSA/NECMA                                1988–2006       Geocode xxxx        Appendix 10

Geocode file variables

Based on address information reported by respondents at the time of the interview, survey staff identify the State, county, and metropolitan statistical area of residence for each respondent.  This information is the basis for the geographic residence variables in the main data set and on the Geocode CD.  Similar information is provided for the respondent’s residence at birth and at age 14.  Additionally, the current residence variables are merged with information from several other data files, namely the City Reference File (Census 1973, 1982, 1983, 1987, 1992) and the County & City Data Book (Census 1972, 1977, 1983, 1988, 1994), to provide detailed information on the environmental characteristics of the State, county, and metropolitan statistical areas in which each NLSY79 respondent resides. 

SMSA     Standard Metropolitan Statistical Area
MSA      Metropolitan Statistical Area
CMSA     Consolidated Metropolitan Statistical Area
PMSA     Primary Metropolitan Statistical Area
NECMA    New England County Metropolitan Area

 

The set of variables titled ‘Current Residence in U.S.?’, available since 1988, is based on the county or the country/territory of residence.  Finally, for select survey years Geocode information is available on the location of respondents’ jobs, the location of colleges attended, and the point of discharge from military service.

Environmental Characteristics:  The types of information depicted in the table below, drawn from the County & City Data Book files (1972, 1977, 1983, 1988, 1994), have been added to the NLSY79 “Geocode” areas of interest.  Variables are available for both the county and metropolitan statistical area of current residence for the 1979–82 survey years and for the county level only for later years.  Users will note that some of these variables are available only for the 1979–82 surveys; the 1983–2006 Geocode files contain a reduced set of variables.

Table 4.18.2 Types of County or Metropolitan Statistical Area Environmental Characteristics on the NLSY79 Geocode CD

Population sizes

Percent of population that is:
·  urban
·  black
·  female
·  under 5 years old
·  65+ years old

Birth/death/marriage/divorce rates

Physician and hospital bed rates

Crime rates

Poverty level data

Educational attainment levels

Median family and per capita income

Recipients of and payments from:
·  AFDC
·  SSI
·  Social Security

Labor force statistics:
·  total labor force
·  civilian labor force
·  number of females in the civilian labor force
·  civilians unemployed versus employed
·  percent employed in various industries

Unemployment rate for labor market of residence

Variables on the unemployment rate of each respondent’s labor market of current residence are available only on the NLSY79 Geocode file.  Additional information on these variables can be found in Appendix 7 in the NLSY79 Codebook Supplement.

The source of the ‘Unemployment Rate’ variables is the May issue of the Bureau of Labor Statistics’ Employment and Earnings for the year following the survey year; figures from March of each survey year are used.  The relevant table in that publication supplies unemployment rates for each State and for selected metropolitan statistical areas.  Respondents who reside within one of these metropolitan statistical areas are assigned the appropriate unemployment rate.  For those residing outside of these areas, a “balance of State” unemployment figure is computed by taking the State totals for the size of the civilian labor force and the number employed and subtracting the figures for the population living in metropolitan statistical areas.
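The “balance of State” computation can be illustrated with a minimal Python sketch.  It assumes that the metropolitan figures subtracted are the civilian labor force and number employed for each listed metropolitan statistical area, and that the rate is the unemployed share of the remaining labor force expressed as a percentage; the function names and the figures in the example are hypothetical, and Appendix 7 remains the authority on the actual procedure.

```python
def unemployment_rate(labor_force, employed):
    """Percent of the civilian labor force that is not employed."""
    if labor_force <= 0:
        raise ValueError("labor force must be positive")
    return 100.0 * (labor_force - employed) / labor_force


def balance_of_state_rate(state_labor_force, state_employed, msa_figures):
    """Rate for the part of a State outside the listed metropolitan areas.

    msa_figures holds one (labor_force, employed) pair per metropolitan
    statistical area with its own published rate (assumed input layout).
    """
    msa_figures = list(msa_figures)  # allow generators; we iterate twice
    msa_labor_force = sum(lf for lf, _ in msa_figures)
    msa_employed = sum(emp for _, emp in msa_figures)
    return unemployment_rate(
        state_labor_force - msa_labor_force,
        state_employed - msa_employed,
    )


# Hypothetical State with two listed metropolitan areas (made-up figures).
rate = balance_of_state_rate(
    state_labor_force=1_000_000,
    state_employed=930_000,
    msa_figures=[(400_000, 376_000), (200_000, 188_000)],
)
print(round(rate, 1))  # 8.5
```

The sketch ignores complications such as metropolitan areas that cross State lines, which the source does not discuss here.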

Neighborhood Quality:  The neighborhood quality series (1992 and 1994–2000) is taken from the National Commission on Children Parent & Child Study, 1990 Parent Questionnaire.  In this series of questions, respondents rate how much of a problem issues such as crime, lack of police protection, unsupervised children, and joblessness are in their neighborhood.

Other Geographic Variables:  Additional geographic information, available only for use at the Center for Human Resource Research, includes the latitude and longitude of each respondent’s residence.  This information is used as input to computer mapping programs and its usage requires special clearance from the Bureau of Labor Statistics.  Similarly, users may obtain special permission to use zip code and Census tract data available at the BLS offices in Washington, DC.

An additional set of geographic mobility measures is available on the Women’s Support Network File for NLSY79 females interviewed during 1983–85.  Three “across-wave” files present on this supplemental data set compare the extent of matching between female respondents’ own addresses and telephone numbers across the following three survey periods:  1983 to 1984, 1983 to 1985, and 1984 to 1985.  The following types of measures are available:  (1) extent of zip code match (all 5-digit match, first 3-digit match, same State, same subregion, same region, different region); (2) extent of telephone number match (same phone number, same exchange, same area code, same State, same subregion, same region, different region); (3) extent of city/State match (same city, same State, same subregion, same region, different region); and (4) distance of move or separation (same 5-digit zip code, within 50 miles, 51–150 miles, 151–300 miles, 301–600 miles, 601–1000 miles, 1001–1400 miles, 1401–1800 miles, more than 1800 miles).  Those interested in this separate data set should contact User Services to get the special documentation available for these files, as well as ordering information.
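As an illustration of the distance-of-move categories listed above, the following Python sketch assigns a distance in miles to one of the nine categories.  It is only a sketch: the function name and label strings are hypothetical, the file’s actual numeric codes are not reproduced here, and treating each upper bound as inclusive (so that 50.5 miles falls in the “51-150 miles” category) is an assumption.

```python
from bisect import bisect_left

# Inclusive upper bounds of the distance categories, in miles.
UPPER_BOUNDS = [50, 150, 300, 600, 1000, 1400, 1800]
LABELS = [
    "within 50 miles",
    "51-150 miles",
    "151-300 miles",
    "301-600 miles",
    "601-1000 miles",
    "1001-1400 miles",
    "1401-1800 miles",
    "more than 1800 miles",
]


def distance_category(miles, same_zip=False):
    """Return the distance-of-move category for one pair of addresses."""
    if same_zip:
        return "same 5-digit zip code"
    if miles < 0:
        raise ValueError("distance cannot be negative")
    # bisect_left finds the first bound that is >= miles.
    return LABELS[bisect_left(UPPER_BOUNDS, miles)]


print(distance_category(0, same_zip=True))  # same 5-digit zip code
print(distance_category(120))               # 51-150 miles
print(distance_category(2500))              # more than 1800 miles
```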

Survey Instruments:  Data on residence at birth and at age 14, as well as the 1979–82 present/most recent residence series, were collected using questions found within Section 1 (“Family Background” and “On Family”) of the 1979, 1980, and 1982 questionnaires.  All other variables are created from or determined by the geographic information provided by each NLSY79 respondent within the locator section of the questionnaire or from the interviewing Face Sheet or internal NORC locating files.

Data Files: Residence variables discussed above can be found within the “Family Background,” “Key Variables,” “Geocode xxxx,” or “Misc. xxxx” areas of interest; the first table above specifies the particular areas of interest for each variable.  The level of detail available determines, in general, whether a variable is placed within the restricted release “Geocode xxxx” files or is present within one of the areas of interest on the main data set.  Thus, general country level information, such as whether the respondent resided at various points in time within or outside of the United States, is available to all users with no restriction, while the specific county or SMSA in which he or she resided at a specific interview point is present only within the restricted release Geocode data files.  All environmental variables, including the ‘Unemployment Rate for the Labor Market of Current Residence,’ are present on the restricted release “Geocode xxxx” areas of interest on the Geocode CD.  The collapsed version of the labor market unemployment rate variable is located in the “Key Variables” area of interest on the main data file.

Documentation: Several attachments and appendices in the NLSY79 Codebook Supplement and/or the NLSY79 Geocode Codebook Supplement offer creation procedure information and coding systems for the geographic residence variables.  These appendices and attachments are described in detail in section 3.3 of this guide.  The Documentation column of Table 4.18.1 identifies the attachment or appendix relevant to each variable.

User Notes:  The coding of respondents’ geographic location before 1993 required extensive hand-editing and is not completely accurate.  The most common error is the potential assignment of a respondent to an adjacent county of residence.  Data on addresses, zip codes, and phone numbers are used to clean the geographic codes.  The post-1988 use of telephone number information improved data quality.  A brief discussion below provides more information on both the hand-edits performed each year and the created variable that indicates the extent of hand-editing required for each case; see Appendix 10 in the Geocode Codebook Supplement for more details.

Additional important information on geographic variables is provided below.

 

Attaching Other Variables to Existing Geocode Records.  The State and county codes used in constructing the Geocode files are the Federal Information Processing Standards (FIPS) used in the County & City Data Book publications and data files.  Users may attach additional county and metropolitan statistical area-level data from a variety of sources by simply merging information from the desired source with the Geocode data based upon the State, county, and metropolitan statistical area of residence codes in the Geocode file.
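A merge of this kind might look like the following Python sketch.  It relies on the fact that FIPS State codes have two digits and FIPS county codes have three; everything else (the dictionary layout, the field name median_rent, the respondent identifiers, and the data values) is hypothetical and chosen only for illustration.  Negative codes are treated as missing residence information, which is an assumption about how a user would handle such cases.

```python
def fips_key(state_code, county_code):
    """Build a 5-character key: 2-digit State code + 3-digit county code."""
    return f"{int(state_code):02d}{int(county_code):03d}"


def attach_county_data(respondents, county_table, fields):
    """Copy the requested county-level fields onto each respondent record.

    Records with negative (missing) State or county codes, or with no
    matching county in county_table, receive None for every field.
    """
    merged = []
    for row in respondents:
        out = dict(row)
        extra = {}
        if row["state"] >= 0 and row["county"] >= 0:
            key = fips_key(row["state"], row["county"])
            extra = county_table.get(key, {})
        for name in fields:
            out[name] = extra.get(name)
        merged.append(out)
    return merged


# Hypothetical Geocode extract and external county file (made-up values).
respondents = [
    {"id": 1, "state": 39, "county": 49},
    {"id": 2, "state": -4, "county": -4},
]
county_table = {"39049": {"median_rent": 500}}

for record in attach_county_data(respondents, county_table, ["median_rent"]):
    print(record["id"], record["median_rent"])
```

Keying on the concatenated five-character string keeps leading zeros intact, which matters for States whose FIPS code is below 10.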

Edited versus Unedited Versions of State/County of Residence.  For some years (1979–82, 1988–89, 1991–92), two versions of the State and county of residence variables have been included in the “Geocode xxxx” files.  The set occurring at the beginning of each file is the edited version, while the variables found near the end of the files for these years are unedited.  If the variable has an actual source question number/name, it is the original from NORC.  If the source question name says created, it is the edited/created version.  Note that the unedited variables are sometimes combined into a single variable, with the State and county code appended to each other.  These raw variables are preceded by the word “GEOCODE” in the variable title.  The edited residence variables contain the corrections made for erroneous address information and are the ones from which the Geocode files themselves are constructed.  Users should be aware that the edited version of these variables does not contain data for those respondents who are in the active military forces or who are living abroad or in a U.S. territory.  Codes of “-4” appearing in the unedited versions of the State or county variables (because foreign country and U.S. territory codes are placed in one field or the other) should not appear in the edited versions of these residence variables.
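Users working with a combined unedited variable may want to separate it into its State and county parts.  The Python sketch below assumes that the combined value is the State code followed by a three-digit county code, and that any negative value signals missing data; both are assumptions that should be checked against the codebook for the year in question, and the function name is hypothetical.

```python
def split_state_county(combined):
    """Split a combined code into (state, county).

    Assumes the layout is State code followed by a 3-digit county code.
    Returns (None, None) for missing or negative values.
    """
    if combined is None or combined < 0:
        return (None, None)
    state, county = divmod(int(combined), 1000)
    return (state, county)


print(split_state_county(39049))  # (39, 49)
print(split_state_county(-4))     # (None, None)
```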

New Geocode Procedures for Assigning Residence Codes and Hand-Editing Discrepant Cases.  During the 1988 hand-editing process, it became evident that the telephone numbers were very accurate, even in cases for which the address information contained discrepancies.  Beginning in 1989, the area code and phone exchange were used to assign State and county of residence codes.  The State assigned by the area code was then compared to the State assigned on the basis of zip code alone and the State contained in the original NORC respondent file.  A “quality of match” variable was computed on the basis of how well these States match.  For a more detailed discussion of these new assignment and matching procedures, refer to “Appendix 10: Geocode Documentation” in the Geocode Codebook Supplement.  This process was used through the 1994 release.
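The three-way State comparison can be pictured with a short Python sketch.  The categories returned here are hypothetical and are not the codes of the actual “quality of match” variable, which are documented in Appendix 10; the sketch only shows the idea of checking the zip-based and NORC-file States against the State assigned from the area code.

```python
def quality_of_match(area_code_state, zip_state, norc_state):
    """Describe how well two other sources agree with the area-code State.

    The returned labels are illustrative only, not the codes used on the
    Geocode file.
    """
    zip_agrees = zip_state == area_code_state
    norc_agrees = norc_state == area_code_state
    if zip_agrees and norc_agrees:
        return "all sources agree"
    if zip_agrees:
        return "zip agrees, NORC file differs"
    if norc_agrees:
        return "NORC file agrees, zip differs"
    return "neither source agrees"


print(quality_of_match(39, 39, 39))  # all sources agree
print(quality_of_match(39, 39, 18))  # zip agrees, NORC file differs
```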

The hand-editing procedure has also been streamlined.  In 1989, the first year in which the phone assignment procedure was used, the residence codes assigned on the basis of the area code and exchange were compared to the raw residence variables received from NORC.  Those with information that did not match were identified for individual examination.  Ideally, the discrepancies requiring individual examination would be reduced to those cases which are “genuine movers” or which have zip codes covering multiple counties and would require some verification that the correct county was assigned based upon the phone information.  The current process for identifying discrepancies and hand-editing is aimed more directly at achieving this objective. 

Beginning in 1990, the residence codes assigned based on phone information were compared to the 1989 CHRR-edited residence information to identify cases for individual examination.  Because the previous year’s edited variables incorporate the corrections that were made in the hand-editing process from earlier years, repeated editing of the same cases across years decreased.  Through this process, the discrepancies in residential Geocode information were reduced.  The number of cases requiring individual examination also decreased and was restricted more closely to the population of “genuine movers” and people with multiple-county zip codes and phone numbers that require verification of county of residence. 
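The year-to-year comparison described above amounts to flagging every case whose phone-based assignment differs from the previous year’s edited codes.  A minimal Python sketch follows; the dictionary layout, respondent identifiers, and codes are hypothetical, and treating a respondent with no prior-year record as a case for review is an assumption.

```python
def cases_for_review(phone_assigned, prior_edited):
    """Return respondent IDs whose phone-based (State, county) codes differ
    from the previous year's edited codes, in ascending ID order.

    Respondents with no prior-year edited record are also flagged.
    """
    flagged = []
    for respondent_id, codes in sorted(phone_assigned.items()):
        if prior_edited.get(respondent_id) != codes:
            flagged.append(respondent_id)
    return flagged


# Hypothetical (State, county) codes keyed by respondent ID.
phone_assigned = {101: (39, 49), 102: (39, 41), 103: (6, 37)}
prior_edited = {101: (39, 49), 102: (39, 49)}

print(cases_for_review(phone_assigned, prior_edited))  # [102, 103]
```

Because the comparison uses the prior year’s edited codes rather than raw codes, a case corrected once does not reappear unless its phone-based assignment changes, which is the reduction in repeated editing that the text describes.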

The hand-editing process in previous years included not only these genuine movers and multi-county zip code dwellers, but also other cases for which elements of the address are simply in error or incompatible with each other.  Some of these cases could potentially require editing for the same errors in more than one year, even if the respondent stayed in one location.  Hand-editing procedures were further streamlined, and in some cases automated, to produce the 1992 data.

Beginning in 1996, a new procedure for verifying and assigning correct final Geocode information was instituted.  This procedure is now performed using specialized address tracking Geocode software.  The processes are described in Appendix 10 of the Geocode Codebook Supplement. It is the belief of CHRR staff members not only that the current procedures are more efficient in identifying true discrepancies and streamlining the hand-editing process, but also that they result in more accurate and consistent assignment of State and county codes in general. 

Missing Values, New England Cases, and Mobility.  Missing values in location of residence variables and metropolitan statistical area codes are associated with respondents who are in the active military forces or who are living abroad or in a U.S. territory.  Users should be aware that, because the New England County Metropolitan Area (NECMA) codes are not comparable to metropolitan statistical areas from the remainder of the country, New England cases are eliminated from some of the procedures used to construct the Geocode files.

The review and hand-editing process has been periodically revised to improve the accuracy of the data and the efficiency of data production.  These changes may affect measured mobility rates between some years; the potential implications are noted in “Appendix 10: Geocode Documentation” of the NLSY79 Geocode Codebook Supplement.  Users should read Appendix 10 carefully to gain a better understanding of the issues outlined above and their implications for specific research endeavors.

Comparison to Other NLS Cohorts:  Data on the respondent’s area of residence are available for all cohorts.  Geographic residence information for those NLSY79 children who resided with their mother can be inferred from the residence data of their mothers.  The NLSY97 main created variables indicate whether the respondent lives in an urban or rural area, whether the respondent lives in a Metropolitan Statistical Area, and in which Census region the respondent resides.  More detailed information is available on the restricted-use Geocode CD.  Region of residence and geographic mobility of Original Cohort respondents are provided for most survey years.  For more complete information, consult the BLS website at www.bls.gov/nls or the appropriate cohort’s User’s Guide.


