Errata for 1979-2016 Data Release

National Longitudinal Survey of Youth - 1979 Cohort

Newest Errata [posted 5/31/2019]

Review of Imputations of NLSY79 Asset Variables 1985-2012

An internal review of the imputed versions of the NLSY79 asset variables from 1985-2012 is currently underway. While computing the new series of Total Net Family Wealth (TNFW_TRUNC) variables, some anomalies were discovered in a subset of imputations.

On review, it appears that the OLS method used for the current imputations can result in variances between imputed values and their bounding values that are unexpected, or that are smaller or larger than expected. In addition, some imputations were found to have been generated for unbounded survey years: 1985, the first year in which detailed asset information was collected, which lacks a previous bounding year; and 2012, the latest year of collection before the current release, which lacked a subsequent bounding year.

While the review is being undertaken, the imputed asset variables have been removed from the current public data release. In addition, the existing NET_WORTH_[YR] variables, which incorporated imputed values in the calculation of a family's total worth, have also been removed. A new set of variables named TNFW_TRUNC (Total Net Family Wealth) replaces the former NET_WORTH_[YR] variables. TNFW_TRUNC calculations do not incorporate imputed values. Users can still compute imputations from the existing data, using methods of their choosing in accordance with their own research requirements.

Updates about these variables will be posted on the Errata page. More information on asset and debt variables, as well as computations of the Total Net Family Wealth variables, can be found in Appendix 23: Revised Asset and Debt and Computed Total Net Wealth Variables. References to the imputed and NET_WORTH_[YR] variables have been removed to accurately reflect the content of the current public release data.

Other Errata

NLSY79 Political Attitude Questions Containing “0” Codes In 2008 [posted 5/22/2019]

Malfunctioning dynamic response categories very early in the NLSY79 2008 (round 23) field period resulted in data for several political attitude questions containing improper “0” codes. Upon review, CHRR staff were able to determine correct codes for roughly 50 cases across these variables. All other “0” codes in these questions that are not documented in the list of response categories should be coded to “-3.” The corrected variables can be accessed in the file political_attitudes_2008.zip. Corrections will be reflected in the next NLSY79 public data release.
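
The recode rule above can be sketched in a few lines of Python. This is a minimal illustration, not CHRR's procedure: the function name, the `documented_codes` argument and the sample values are hypothetical, and the corrected values in political_attitudes_2008.zip should take precedence wherever they exist.

```python
RECODE_VALUE = -3  # code the erratum assigns to undocumented "0" values


def recode_undocumented_zeros(values, documented_codes):
    """Return a copy of `values` with undocumented 0 codes replaced by -3.

    `documented_codes` is the set of response codes listed for the question
    (a hypothetical input that the user would take from the codebook).
    """
    if 0 in documented_codes:
        # 0 is a documented response for this question, so leave it alone.
        return list(values)
    return [RECODE_VALUE if v == 0 else v for v in values]


# Hypothetical question whose documented categories are 1 through 4.
raw = [1, 0, 3, 0, 4]
cleaned = recode_undocumented_zeros(raw, documented_codes={1, 2, 3, 4})
print(cleaned)
```

Checking the documented categories first matters because a "0" is only improper when it is not a listed response for that question.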

Case Data Deletion [posted 5/7/2019]

After the release of the 2016 data, the case data for respondent id 7645 were determined to be invalid for survey years 2012, 2014, and 2016. The survey data for this respondent for those survey years will be removed from the data at the time of the next data release. In the meantime, users should avoid using data for this respondent for survey years 2012, 2014, and 2016.
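
Until the next release, users may want to screen out the affected respondent-years themselves. The sketch below assumes a hypothetical long-format layout with one (respondent ID, survey year) pair per record; only the ID and the three survey years come from this erratum.

```python
# Respondent ID and survey years named in the erratum.
INVALID_CASE_YEARS = {7645: {2012, 2014, 2016}}


def is_usable(caseid, survey_year):
    """Return False for respondent-year combinations flagged as invalid."""
    return survey_year not in INVALID_CASE_YEARS.get(caseid, set())


# Hypothetical long-format records: (respondent ID, survey year).
records = [(7645, 2010), (7645, 2012), (7645, 2016), (7646, 2012)]
usable = [rec for rec in records if is_usable(*rec)]
print(usable)
```

Data for this respondent from survey years other than 2012, 2014 and 2016 are left in place, since the erratum flags only those three years.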

Dual Job Variables with Incorrect Areas of Interest [posted 5/6/2019]

Dual Job variables for dual jobs #2, #3 and #4 that occurred during the period covered by the round 27 (2016) interview were inadvertently assigned to Area of Interest WORK HISTORY – DUAL JOB 1. Question names for the incorrectly assigned variables end in “_NUM2/3/4” and variable titles begin with “JOB NUMBER 2/3/4.” The affected reference numbers and their correct Area of Interest assignments are listed below. The mislabeling will be corrected with the next data release.

W13868.00 – W13950.00           WORK HISTORY – DUAL JOB 2

W13951.00 – W14007.00           WORK HISTORY – DUAL JOB 3

W14008.00 – W14045.00           WORK HISTORY – DUAL JOB 4
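
Until the labels are corrected, the ranges above can be applied as a lookup. The following is a rough sketch that assumes reference numbers follow the letter-plus-number pattern shown in the list; the function names are hypothetical.

```python
# (first reference number, last reference number, corrected Area of Interest)
CORRECTED_AREAS = [
    ("W13868.00", "W13950.00", "WORK HISTORY – DUAL JOB 2"),
    ("W13951.00", "W14007.00", "WORK HISTORY – DUAL JOB 3"),
    ("W14008.00", "W14045.00", "WORK HISTORY – DUAL JOB 4"),
]


def refnum_key(refnum):
    """Split a reference number such as 'W13868.00' into ('W', 13868.0)."""
    return refnum[0], float(refnum[1:])


def corrected_area(refnum):
    """Return the corrected Area of Interest, or None if not affected."""
    key = refnum_key(refnum)
    for first, last, area in CORRECTED_AREAS:
        if refnum_key(first) <= key <= refnum_key(last):
            return area
    return None


print(corrected_area("W13951.00"))
```

A result of None means the reference number lies outside the three affected ranges and its existing Area of Interest is not covered by this erratum.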

Data for Respondent Interviewed under Wrong CASEID in 2006 Removed [posted 4/9/2019]

It was recently discovered that limited data for a respondent interviewed under caseid #4646 in 2006 inadvertently remained in previous public data releases. These data have been removed for survey year 2006 (round 22) and for a small number of created variables. The sampling weights and Reason for Non-interview variables for 2006 have also been adjusted. There should be little or no effect on weights for the remaining respondents. The number of completed interviews for 2006 has decreased by 1 to 7653, and the number of non-interviews has increased by 1 to 5033. In addition, changes have been made where necessary to the Work History arrays (STATUS, HOURS and DUAL JOB arrays), Recipiency month data and Employers_all roster items.

Adjustments made to data for respondent #4646 fall into several categories:

  • All data for this respondent have been eliminated for survey year 2006 with the exception of SAMPWEIGHT, C_SAMPWEIGHT and RNI. These values have been corrected to reflect a 2006 non-interview status. In addition, certain variables in the XRND survey year category that pertain to 2006 have been deleted.
  • Some values have been changed to eliminate the cumulative presence of data erroneously collected for #4646, including some variables in 2016.
  • Variables in relevant data arrays/rosters have been adjusted where necessary to eliminate erroneous data and incorporate retrospective data provided by the correct respondent in 2016. These arrays include the Work History arrays, Recipiency month variables and the Employers_all roster.

The corrections discussed above are reflected in the current public data release. Users can contact User Services for further information.

The correct respondent #4646 was located for the 2016 interview and provided some retrospective data. 

Missing Values in Question Loops Added to Current Release [posted 4/9/2019]

Some values for assets collected in repetitive question loops were inadvertently omitted in past releases for survey years 1998-2014. These question loops contained values that did not require topcoding. Asset values reported in repetitive question loops can be identified by the following characteristics:

  • They are assigned to area of interest ASSETS;
  • Question names have a number extension (e.g. .01, .02, etc.);
  • For documentation consistency, question names contain the string “TRUNC”, whether or not they are truncated/topcoded.

Variables containing topcoded values include the string “TRUNC” or “TRUNCATED” in the variable title. Un-topcoded values from loops that were added to the current release do not include those strings in the variable title. Asset values from all repetitive question loops are included in the current public release data.
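
The identifying characteristics above can be expressed as two simple checks. This is only a sketch: the function names are hypothetical, and the example question name and titles are invented placeholders, not actual NLSY79 variables.

```python
import re

# Matches a numeric extension at the end of a question name, e.g. ".01".
LOOP_EXTENSION = re.compile(r"\.\d+$")


def is_asset_loop_variable(area_of_interest, question_name):
    """Apply the three characteristics listed in the erratum."""
    return (
        area_of_interest == "ASSETS"
        and LOOP_EXTENSION.search(question_name) is not None
        and "TRUNC" in question_name
    )


def is_topcoded(variable_title):
    """Topcoded variables carry TRUNC or TRUNCATED in the variable title."""
    return "TRUNC" in variable_title.upper()


# Invented placeholder names, for illustration only.
print(is_asset_loop_variable("ASSETS", "EXAMPLE_ASSET_TRUNC.01"))
print(is_topcoded("EXAMPLE ASSET VALUE (TRUNC)"))
print(is_topcoded("EXAMPLE ASSET VALUE"))
```

The two checks are kept separate because the question name contains "TRUNC" for every loop variable, while only the variable title indicates whether the values were actually topcoded.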

Uncorrectable Data Errors

Legal Form of Business Not Collected for 31 Cases in 2012 [posted 1/23/2015]

Due to an error in the questionnaire, the legal form of a business (SES-BUSOWN-12.#) for 31 NLSY79 respondents was not collected in 2012. This error affects respondents who reported a business in 2012 that matches a job reported during the last interview and who were last interviewed in 2008 or earlier. The legal form of a business for these 31 cases should be coded as -3.

The IDs for these 31 respondents can be found in the following file: legalform12_invalidmissing.xlsx.

Since in 2012 we did not re-ask the legal form of a business that matched a job reported in the last interview, users wishing to determine the legal form of that business should use the 2012 variable that holds the employer number from the previous survey year (EMPLOYER_EMPPREVID.#). For example, if EMPLOYER_EMPPREVID.# is 1, meaning that the business is the same as job #1 in 2010, then the legal form of that business in 2012 is the value of SES-BUSOWN-12.01 or BUSOWN-12.01 in 2010. If the business in 2012 is job #2 in 2010, then the legal form of that business is the value of SES-BUSOWN-12.02 or BUSOWN-12.02 in 2010.
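
The lookup described above amounts to using EMPLOYER_EMPPREVID.# as a key into the previous round's legal form values. The sketch below assumes a hypothetical dictionary that maps each job number from the previous interview to that job's SES-BUSOWN-12.## (or BUSOWN-12.##) value; the function name and the codes 1 and 3 are placeholders, not documented categories.

```python
def legal_form_from_previous_round(emp_prev_id, previous_legal_forms):
    """Carry forward the legal form of a business from the previous interview.

    emp_prev_id: value of EMPLOYER_EMPPREVID.# for the 2012 business.
    previous_legal_forms: hypothetical dict mapping previous-round job number
        to the value of SES-BUSOWN-12.## / BUSOWN-12.## in that round.
    Returns None when the previous round holds no value for that job.
    """
    return previous_legal_forms.get(emp_prev_id)


# Placeholder 2010 values: job #1 has code 3, job #2 has code 1.
forms_2010 = {1: 3, 2: 1}
print(legal_form_from_previous_round(1, forms_2010))
print(legal_form_from_previous_round(2, forms_2010))
```

Returning None rather than a default keeps unmatched businesses visibly unresolved, so the user can decide how to code them.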

Missing Occupation, Industry and Class of Worker in 1994 data items

The occupation, industry and class of worker information for 353 CPS employers was not collected during the 1994 interview. These CPS employers were either less than 9 weeks in duration since the last interview, or were employers for whom the respondent worked less than 10 hours per week. They were erroneously treated as other non-CPS employers with those characteristics, for which occupation, industry and class of worker information is not collected. For those employers that were also reported in the previous survey year, and for which the respondent confirmed that his/her occupation did not change since the previous survey year, the occupation, industry and class of worker codes from the previous survey year should also apply. Users may also use data from subsequent survey years in a similar manner to attempt to fill in more of this information.

This error is present on all current NLSY79 data releases.

Missing information on Union Affiliation/Collective Bargaining in 1994 data items

Due to an error in the questionnaire, information on union affiliation and collective bargaining for a number of employers was not collected. Respondents reporting a non-self-employed job should have answered these questions. This error affects employer #1 (generally the CPS employer) for 3,210 of the 7,141 respondents who should have been asked, employer #2 for 531 of the 2,215 who should have been asked, employer #3 for 128 of the 606 who should have been asked, employer #4 for 34 of the 168 who should have been asked and employer #5 for 6 of the 48 who should have been asked. This is 45% missing for employer #1, 24% missing for employer #2, 21% missing for employer #3, 20% missing for employer #4 and 13% missing for employer #5.
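
The percentages above follow directly from the counts. As a quick check, the sketch below recomputes them with half-up rounding, which is an assumption about how the published figures were rounded; the variable and function names are illustrative.

```python
# employer number -> (respondents missing the questions, respondents eligible)
COUNTS = {1: (3210, 7141), 2: (531, 2215), 3: (128, 606), 4: (34, 168), 5: (6, 48)}


def pct_missing(missing, eligible):
    """Percent missing, rounded half up using integer arithmetic.

    Python's built-in round() rounds halves to the nearest even number,
    which would turn 12.5 into 12, so the rounding is done by hand here.
    """
    return (200 * missing + eligible) // (2 * eligible)


percentages = {emp: pct_missing(*pair) for emp, pair in COUNTS.items()}
print(percentages)
```

Employer #5 is the only case where the rounding rule matters, because 6 of 48 is exactly 12.5%.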

Conversely, information on union affiliation and collective bargaining was collected on a number of self-employed respondents, for whom these questions should not have been asked. This error affects employer #1 for 166 cases, employer #2 for 45 cases, and employers #3, #4 and #5 for 1 case each. This information for self-employed respondents (those with a code of "4" for class of worker) should be disregarded.
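
One way to disregard these values is to blank out the union items whenever the class of worker code is 4. This is a minimal sketch: the function name is hypothetical, and the sample union value of 1 is a placeholder, not a documented code.

```python
SELF_EMPLOYED = 4  # class of worker code named in the erratum


def usable_union_value(class_of_worker, union_value):
    """Return the union/collective bargaining value unless self-employed.

    For self-employed respondents the question should not have been asked,
    so the collected value is replaced with None.
    """
    if class_of_worker == SELF_EMPLOYED:
        return None
    return union_value


# Hypothetical (class of worker, union value) pairs for one employer slot.
pairs = [(1, 1), (4, 1), (2, 0)]
cleaned = [usable_union_value(cow, union) for cow, union in pairs]
print(cleaned)
```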

This error is present on all current NLSY79 data releases.

2 missing cases in 1994 data items

Due to probable machine glitches, the data from two (2) apparently completed interviews were rendered inaccessible. 1994 variables for cases #5078 and #10524 are missing. Any 1994 data items remaining for these cases are meaningless and should be discarded for purposes of analysis. For these cases, the 1996 interview covered the period from the 1993 interview to the 1996 interview. Information that would have been collected at the 1994 interview is thus now included in the data for the 1996 survey year.

This data error is present on all current NLSY79 data releases.

NORC 1978 Memo

The 1978 NORC memo regarding race and ethnicity coding can be found here: NORC 1978 memo