Errata for NLSY97 Round 19 Release
NEWEST ERRATA
Duplicate Job IDs in the NLSY97 [posted 4/7/2022, updated 11/29/2023]
In response to various user questions concerning this topic, NLS archivists are currently reviewing all occurrences in which the same job uid appears on the YEMP roster more than one time in the same interview round. The list of all of these occurrences can be found in the file nlsy97jobswithsameuids.xlsx. When the archivists complete the review of these cases, they expect to find that these cases will fall into three distinct categories: 1) The jobs are the same but the respondent reported working at them multiple times during the same interview period; 2) The jobs are in fact distinct and should be assigned separate unique ids; 3) The jobs are the same, and the information collected about these jobs was duplicated. Once the review is done, we will post an erratum that includes the type of change and the affected variables.
Following the review of duplicate jobs, updates were made to these cases using the methods listed below:
Category 1: These jobs include duplicate job IDs with spells that do not overlap as well as those with spells that overlap in some weeks. For these cases, the YEMP_UID remains on the YEMP_ roster, and all interview and event history data remain intact. In the survey round in which the duplicated jobs appear, variables relating to tenure (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) were updated to reflect total tenure (including both spells) rather than the number of weeks by spell. In subsequent rounds, the job-specific tenure variables reflect the total number of weeks worked. Additional checks were made and, when necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.
Category 2: Six jobs were found to have an incorrect job ID and were updated to the correct job ID in the employer roster (YEMP_UID), and all variables associated with both jobs were updated (PUBID= 246, 3156, 4528, 4955, 5868, and 8371). This potentially included information in the EMP_STATUS and EMP_DUAL arrays along with the associated job-specific tenure measures (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) in that and subsequent rounds. When necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.
Category 3: Sixteen jobs that overlapped completely were deleted because the respondent appeared to have reported the same job twice (PUBID = 1374, 1586, 2742, 2947, 3132, 3449, 5641, 5644, 5649, 5888, 6325, 6348, 6851, 6859, 7217, and 9011). For these jobs, the extra job was deleted from the interview data (YEMP-) and the employer roster (YEMP_). Information about the repeated job in the EMP_DUAL job arrays and the EMP_HOUR array was updated to remove the job. All information for the extra job was deleted in the affected round (CV_HRLY_COMPENSATION, CV_HRLY_PAY, CV_HRS_PER_WEEK, CV_JOB_13_WKS, CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) and updated in the combined variables (CVC_HOURS_WK_YR), as well as in any successive rounds in which the job occurred (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR). Additional checks were made and, when necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.
As part of this process, staff also conducted a more comprehensive tracing of job ID values and their link to the actual job name as reported over time. This review has identified a total of 75 cases in which some form of data correction was necessary. These cases are listed in the file NLSY97 YEMP_roster_changes_2023.xlsx.
The types of data correction are as follows. First, in a total of 30 cases, a technical error caused jobs that were continuing from previous rounds to be inadvertently assigned a new job ID. Second, in 25 cases, the respondent failed to correctly report a job as a previously held job. For both of these types, the correct job ID was assigned, and the corresponding data and created variables were updated in the same manner as described for Category 2 above.
Third, in two cases, a false job was reported that needed to be deleted. Fourth, in 18 cases, a duplication of job data was detected within the same survey year. The duplicate job roster data and employment section data for these cases have now been removed. The process of updating the data for deleted cases is the same as described above for Category 3.
These corrections will be included in the NLSY97 round 20 public release data.
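The condition that started this review (the same job uid appearing more than once on the YEMP roster within one interview round) can be checked in a user's own extract roughly as follows. This is a minimal sketch, not the archivists' procedure: the function name and the input layout (one `(pubid, round_year, yemp_uid)` tuple per roster slot) are assumptions made for illustration, and the sample rows are invented.

```python
from collections import Counter


def find_duplicate_job_uids(roster_rows):
    """Return {(pubid, round_year): [uids that appear more than once]}.

    roster_rows: iterable of (pubid, round_year, yemp_uid) tuples, one per
    roster slot (hypothetical layout). Negative values are treated as
    NLSY missing/skip codes and ignored.
    """
    counts = Counter(
        (pubid, round_year, uid)
        for pubid, round_year, uid in roster_rows
        if uid is not None and uid >= 0
    )
    duplicates = {}
    for (pubid, round_year, uid), n in counts.items():
        if n > 1:
            duplicates.setdefault((pubid, round_year), []).append(uid)
    return duplicates


# Invented rows for illustration only (not real NLSY97 data).
rows = [
    (1, 2010, 200901),
    (1, 2010, 200901),  # same uid twice in the same round
    (1, 2011, 200901),  # same uid in a later round is expected, not flagged
    (2, 2010, 200902),
    (2, 2010, -4),      # valid skip: empty roster slot
]
print(find_duplicate_job_uids(rows))  # {(1, 2010): [200901]}
```

A flagged pair only identifies a candidate. Deciding whether it belongs to Category 1, 2, or 3 still requires comparing the spell dates and job details, as described above.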
OTHER ERRATA
Corrections to NLSY97 Schooling Roster and Event History Data [posted 11/27/2023]
NLSY97 archival staff have identified a problem that affects colleges that appear on the NEWSCHOOL roster. The NEWSCHOOL_INTERVIEW.xx and NEWSCHOOL_PUBID.xx variables are intended, respectively, to identify the round in which a particular school is first reported and then to assign a permanent public ID to the school within each respondent's enrollment history. This allows public data users to determine whether a respondent is enrolled at a school that was first reported in an earlier round, or whether a respondent has enrolled in a new school. For 320 respondents, we have discovered that some colleges which appear to be newly reported colleges are in fact schools that were originally reported in an earlier round. A list of PUBIDs and years affected can be found here: CollegeIDchanges_2023.xlsx.
In order to correct this problem, the original interview round and public id have been reassigned to these schools wherever they appear in later rounds. These corrections also affect the created event history array based on the college attended SCH_COLLEGE_ID_year.xx and the associated term SCH_COLLEGE_TERM_year.xx and degree SCH_COLLEGE_DEGREE_year.xx. Corrections have been made in all rounds of data from 1999 through 2019.
This review also identified a small number of cases in which the SCH_COLLEGE_TERM array was repeated and needs to be corrected (PUBID= 584, 1182, 1200, 2552, 3365, 3497, 5677, 6179, 6341, 6589, 6773, 7281, 7537, 8096, and 8917). Additionally, updated information was provided for cases in which the respondent has reported an ‘other specify’ answer for the degree pursued (PUBID=61, 1081, 2779, 5710, 6239, 6674, 7105, 7460, and 8684); these answers were later coded into actual degrees. For these cases, the SCH_COLLEGE_STATUS array was updated and the SCH_COLLEGE_TERM, SCH_COLLEGE_ID, SCH_COLLEGE_DEGREE arrays were added to the data.
All updates will be included in the NLSY97 round 20 public release data.
COVID_ITEM_MODE [posted 10/23/2023]
The COVID_ITEM_MODE variable will be added to the upcoming release. This variable determines the mode of interview based on the individual check items administered mechanically during the survey. The data can be found here: COVID_ITEM_MODE.xlsx
Corrections to the Created Incarceration Variables and Event Histories [posted 10/23/2023]
Updates have been made for 15 respondents in the incarceration arrays. These updates apply to respondents with 4 or more arrests who completed an incarceration prior to the interview date and had a subsequent incarceration in the same round. The program that creates these variables did not capture the later incarceration. Updates were made to the following variables: INCARC_AGE_FIRST, INCARC_FIRST, INCARC_LENGTH_FIRST, INCARC_LENGTH_LONGEST, INCARC_TOTMONTHS, INCARC_TOTNUM and various INCARC_STATUS variables between 2003.05 and 2019.09. The affected respondents are the following: 3127, 4044, 4937, 5066, 5092, 5286, 5404, 5760, 5893, 7375, 7483, 8571, 8607, 8761, and 8843. These variables will be made available on the upcoming public release.
NLSY97 CV_MSA Coding Error in Round 18 and Round 19 [posted 5/22/2023]
Respondents who were not in the country were miscoded in the CV_MSA variable. These respondents were assigned a valid skip (-4) rather than the established category for not in the country (code=5). All of the respondents with a valid skip for these rounds should be recoded to the value of 5. This will be updated on the next release.
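Until the corrected release is available, users can apply the recode themselves. The sketch below assumes the Round 18 or Round 19 CV_MSA values have already been loaded into a plain Python list. The function name and the sample values are illustrative assumptions, and the recode should be applied only to these two rounds.

```python
VALID_SKIP = -4      # code assigned in error, per the erratum
NOT_IN_COUNTRY = 5   # established CV_MSA category for "not in the country"


def recode_cv_msa(values):
    """Return a new list with valid skips (-4) replaced by code 5.

    Intended only for Round 18 and Round 19 CV_MSA values; all other
    codes, including other negative codes, are left unchanged.
    """
    return [NOT_IN_COUNTRY if v == VALID_SKIP else v for v in values]


# Invented values for illustration only.
cv_msa_round18 = [1, 3, -4, -5, 2, -4]
print(recode_cv_msa(cv_msa_round18))  # [1, 3, 5, -5, 2, 5]
```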
Income and Assets Review [posted 11/4/2022]
The NLS program has conducted a review of income and assets topcoding and standardized these values. Adjustments have been made to a very limited number of values to reflect the current standards.
Missing 1998 Machine Check Variables [posted 10/25/2022]
A set of machine check freelance job variables from 1998 has been inadvertently omitted from the public release. The variables can now be downloaded from yemp112800.zip until they are made available on the next public release.
Errata for Custom Weighting Program [posted 3/25/2022]
An error in the NLSY97 custom weighting program was discovered and corrected on March 10, 2022. The program application allowed an option to reclassify all members in the sample as cross-sectional. A program error caused this option to be exercised, which redefined all user-entered oversample IDs as cross-sectional. As a result, the program generated weights without accounting for cross-sectional versus oversample status. The created variables contained on the public release are unaffected, as are yearly weights generated from the Weight Years page in the Custom Weighting program.
Corrections in Some Employer ID, Created, and Event History Variables [posted 2/25/2022]
A review of employer unique id variables has indicated that for three respondents, we have duplicated ids representing unique jobs in various rounds of data. This problem affects the various YEMP_UID and related XWALKID variables across a number of rounds. Additional updates were made to the associated created and event history variables: EMP_STATUS_year.xx, EMP_DUAL2/3_year.xx, EMP_BK_WKS_year, EMP_BK_STATUS_year, EMP_BK_HOURS_year, EMP_START_YEAR_year, and CV_WKSWK_JOB_DLI. The corrected values for the affected cases are listed below.
PUBID QNAME YEAR RESPONSE
3343 DLI_XWALKID.02 2009 200901
8819 DLI_XWALKID.02 2009 200901
3343 YEMP_UID.01 2010 200901
3343 YEMP_UID.03 2010 200902
8819 YEMP_UID.01 2010 200902
8819 YEMP_UID.02 2010 200901
4007 YEMP_UID.01 2010 201001
4007 NEWEMP_XWALKID.01 2010 201001
3343 YEMP_UID.02 2011 200901
8819 YEMP_UID.02 2011 200902
8819 YEMP_UID.01 2011 200901
4007 YEMP_UID.02 2011 201001
8819 YEMP_UID.01 2013 200901
8819 YEMP_UID.01 2015 200901
8819 YEMP_UID.02 2017 200901
3343 CV_WKSWK_JOB_DLI.01 2010 85
4007 CV_WKSWK_JOB_DLI.01 2010 27
8819 CV_WKSWK_JOB_DLI.01 2010 102
8819 CV_WKSWK_JOB_DLI.02 2010 102
3343 CV_WKSWK_JOB_DLI.03 2010 51
3343 CV_WKSWK_JOB_DLI.02 2011 100
4007 CV_WKSWK_JOB_DLI.01 2011 261
4007 CV_WKSWK_JOB_DLI.02 2011 53
8819 CV_WKSWK_JOB_DLI.01 2011 151
8819 CV_WKSWK_JOB_DLI.02 2011 146
8819 CV_WKSWK_JOB_DLI.01 2013 263
8819 CV_WKSWK_JOB_DLI.01 2015 366
8819 CV_WKSWK_JOB_DLI.02 2017 398
8819 EMP_DUAL_2_2015.44 through EMP_DUAL_2_2015.52 XRND 200901
8819 EMP_DUAL_2_2016.01 through EMP_DUAL_2_2016.23 XRND 200901
8819 EMP_STATUS_2013.45 through EMP_STATUS_2013.52 XRND 200901
8819 EMP_STATUS_2014.01 through EMP_STATUS_2014.52 XRND 200901
8819 EMP_STATUS_2015.01 through EMP_STATUS_2015.43 XRND 200901
8819 EMP_STATUS_2011.38 through EMP_STATUS_2011.53 XRND 200901
8819 EMP_STATUS_2012.01 through EMP_STATUS_2012.52 XRND 200901
8819 EMP_STATUS_2013.01 through EMP_STATUS_2013.44 XRND 200901
3343 EMP_DUAL_2_2010.50 through EMP_DUAL_2_2010.52 XRND 200901
3343 EMP_DUAL_2_2011.01 through EMP_DUAL_2_2011.12 XRND 200901
4007 EMP_DUAL_2_2010.45 through EMP_DUAL_2_2010.52 XRND 201001
4007 EMP_DUAL_2_2011.01 through EMP_DUAL_2_2011.18 XRND 201001
8819 EMP_DUAL_2_2010.41 through EMP_DUAL_2_2010.52 XRND 200902
8819 EMP_DUAL_2_2011.01 through EMP_DUAL_2_2011.32 XRND 200902
8819 EMP_STATUS_2010.41 through EMP_STATUS_2010.52 XRND 200901
8819 EMP_STATUS_2011.01 through EMP_STATUS_2011.37 XRND 200901
8819 EMP_DUAL_2_2009.41 through EMP_DUAL_2_2009.52 XRND 200901
3343 EMP_DUAL_2_2009.47 through EMP_DUAL_2_2009.52 XRND 200902
3343 EMP_DUAL_2_2010.01 through EMP_DUAL_2_2010.14 XRND 200902
8819 EMP_DUAL_2_2010.01 through EMP_DUAL_2_2010.40 XRND 200901
3343 EMP_DUAL_3_2010.15 XRND 200902
3343 EMP_START_WEEK_2010.01 XRND 47
8819 EMP_START_WEEK_2010.01 XRND 41
8819 EMP_START_WEEK_2010.02 XRND 41
3343 EMP_START_WEEK_2010.03 XRND 47
3343 EMP_START_YEAR_2010.01 XRND 2009
8819 EMP_START_YEAR_2010.01 XRND 2009
8819 EMP_START_YEAR_2010.02 XRND 2009
3343 EMP_START_YEAR_2010.03 XRND 2009
8819 EMP_STATUS_2009.41 through EMP_STATUS_2009.52 XRND 200902
3343 EMP_STATUS_2009.47 through EMP_STATUS_2009.52 XRND 200901
3343 EMP_STATUS_2010.01 through EMP_STATUS_2010.49 XRND 200901
8819 EMP_STATUS_2010.01 through EMP_STATUS_2010.40 XRND 200902
4007 EMP_STATUS_2010.18 through EMP_STATUS_2010.44 XRND 201001
3343 EMP_BK_STATUS_2010 XRND -4
3343 EMP_BK_WKS_2010 XRND -4
3343 EMP_BK_HOURS_2010 XRND -4
8819 EMP_BK_STATUS_2010 XRND -4
8819 EMP_BK_WKS_2010 XRND -4
8819 EMP_BK_HOURS_2010 XRND -4
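Several rows above give a range of weekly event history variables ("X through Y") rather than a single name. When applying these corrections by hand, it can help to expand each range into the individual variable names. The helper below is a minimal sketch: the function name is an assumption, and it presumes the names follow the `ARRAY_year.week` pattern exactly as printed in this table, with two-digit week suffixes.

```python
def expand_weekly_range(first, last):
    """Expand a 'first through last' pair into every weekly name, inclusive.

    Both names must share the same prefix (array and year) and differ only
    in the week suffix after the final dot.
    """
    prefix_a, week_a = first.rsplit(".", 1)
    prefix_b, week_b = last.rsplit(".", 1)
    if prefix_a != prefix_b:
        raise ValueError("range must stay within one array and year")
    start, stop = int(week_a), int(week_b)
    if start > stop:
        raise ValueError("first week must not come after last week")
    return [f"{prefix_a}.{week:02d}" for week in range(start, stop + 1)]


names = expand_weekly_range("EMP_STATUS_2013.45", "EMP_STATUS_2013.52")
print(len(names), names[0], names[-1])
# 8 EMP_STATUS_2013.45 EMP_STATUS_2013.52
```

Each expanded name would then receive the value in the RESPONSE column for the listed PUBID.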
Missing Employment Variable [posted 1/26/2022]
As part of the 2021 data release, a large set of variables that were used to determine job type was released to the public for the first time. Unfortunately, one of these variables, the 2005 version of YEMP-9899WDZ, which collects information about on-call status, was inadvertently left off of the release. This variable will be included as part of the next public release. Until then, it can be accessed from the following link: yemp-9899wdz_2005.zip.
Updates to College Event History variables [posted 12/29/2021]
Due to a programming error, the imputation for cases where a respondent reported a valid start date but invalid stop date for an enrollment period was not implemented. This only affected data from Round 18 and Round 19 reports. A total of 54 cases are affected. The corrections are attached (college_errata_data.xlsx) and the data will be updated at the next release.
Missing Assets 35 Variables [posted 12/10/2021]
Seven variables that are part of the combined NLSY97 Assets 35 section were inadvertently omitted from the most recent public release. The complete Investigator tagset for these variables can be downloaded from the following link: yast35missingvars.zip.
Updates to Marital Status and Associated Event History Variables [posted 11/30/2021]
The round-specific created variables (CV_MARSTAT and CV_MARSTAT_COLLAPSED) and associated event history variables were updated in the Round 19 release to reflect respondent corrections to marital and cohabitation status histories. The affected PUBIDs and rounds are listed below:
PUBID | CV_MARSTAT/MARSTAT_COLLAPSED (rounds) |
32 | 12 - 17 |
72 | 18 |
920 | 11, 17, 18 |
1217 | 18 |
1745 | 18 |
1946 | 18 |
2339 | 17 |
3823 | 17 |
3859 | 16 |
4237 | 17 |
4366 | 14, 16, 17 |
4745 | 14 - 16 |
4941 | 12 - 16 |
5272 | 16, 17 |
5952 | 16, 17 |
6072 | 17 |
6801 | 7 - 10 |
6978 | 11 - 17 |
7021 | 18 |
7092 | 18 |
7228 | 13, 17 |
7269 | 13 - 17 |
7439 | 18 |
7795 | 8 - 9 |
8357 | 18 |
8650 | 18 |
8732 | 17 - 18 |
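The round entries in the table above mix single rounds, comma-separated lists, and hyphenated ranges. To flag affected respondent-round pairs programmatically, the entries can be normalized into explicit round numbers. The parser below is a minimal sketch under that reading of the notation; the function name is an assumption.

```python
def parse_rounds(spec):
    """Turn '12 - 17', '11, 17, 18', or '18' into a sorted list of rounds."""
    rounds = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Hyphenated entries are read as inclusive ranges of rounds.
            low, high = (int(piece) for piece in part.split("-"))
            rounds.update(range(low, high + 1))
        else:
            rounds.add(int(part))
    return sorted(rounds)


print(parse_rounds("12 - 17"))     # [12, 13, 14, 15, 16, 17]
print(parse_rounds("11, 17, 18"))  # [11, 17, 18]
```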
In addition, an examination of partners accumulated across multiple rounds indicated that a small number of partners who at first appeared to be unique individuals were in fact the same person. Updates made to the Round 19 release include those to the PARTNERS roster along with the event history and created variables to account for these mis-reports. The affected PUBIDs are listed below:
18 | 1206 | 1941 | 2902 | 3778 | 5352 | 5892 | 6956 | 7839 | 8438 |
191 | 1266 | 1950 | 2903 | 3796 | 5417 | 5990 | 7050 | 7946 | 8641 |
277 | 1397 | 2127 | 2904 | 3921 | 5426 | 6185 | 7147 | 7949 | 8668 |
524 | 1414 | 2297 | 3531 | 4252 | 5448 | 6247 | 7320 | 7964 | 8720 |
549 | 1637 | 2306 | 3565 | 4594 | 5462 | 6479 | 7404 | 8052 | 8743 |
655 | 1684 | 2419 | 3596 | 4624 | 5660 | 6564 | 7582 | 8163 | 8756 |
912 | 1705 | 2576 | 3654 | 4671 | 5727 | 6841 | 7596 | 8215 | 8785 |
980 | 1754 | 2652 | 3656 | 5225 | 5784 | 6876 | 7713 | 8261 | 9018 |
1107 | 1801 | 2810 | 3710 | 5330 | 5803 | 6879 | 7742 | 8346 |
Mismatched Industry and Occupation Codes in Single 2008 Case [posted 11/23/2021]
We have discovered that for a single case in the 2008 data, the industry and occupation codes were not matched to the correct job number for some of this respondent's reported jobs. The corrected industry and occupation codes for these jobs are listed below:
PUBID | VARIABLE | CORRECTED VALUE |
2032 | YEMP_INDCODE-2002.04 | 770 |
2032 | YEMP_INDCODE-2002.05 | 2770 |
2032 | YEMP_INDCODE-2002.06 | 4690 |
2032 | YEMP_INDCODE-2002.07 | 4670 |
2032 | YEMP_INDCODE-2002.08 | 7770 |
2032 | YEMP_INDCODE-2002.09 | -4 |
2032 | YEMP_OCCODE-2002.04 | 6440 |
2032 | YEMP_OCCODE-2002.05 | 8040 |
2032 | YEMP_OCCODE-2002.06 | 7200 |
2032 | YEMP_OCCODE-2002.07 | 7200 |
2032 | YEMP_OCCODE-2002.08 | 4250 |
2032 | YEMP_OCCODE-2002.09 | -4 |
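Corrections published in the pipe-delimited layout above can be applied to a local copy of the data with a small script. The sketch below is illustrative only: the function names and the in-memory layout (`{pubid: {variable: value}}`) are assumptions, only two rows of the table are repeated, and the "before" values in the sample record are invented placeholders rather than the values actually on the release.

```python
CORRECTIONS_TEXT = """\
PUBID | VARIABLE | CORRECTED VALUE |
2032 | YEMP_INDCODE-2002.04 | 770 |
2032 | YEMP_OCCODE-2002.09 | -4 |
"""


def parse_corrections(text):
    """Parse 'PUBID | VARIABLE | VALUE |' rows into {(pubid, variable): value}."""
    corrections = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        if len(cells) != 3 or not cells[0].isdigit():
            continue  # skip the header row and anything malformed
        pubid, variable, value = cells
        corrections[(int(pubid), variable)] = int(value)
    return corrections


def apply_corrections(records, corrections):
    """Overwrite matching values in place and return how many were changed.

    records: {pubid: {variable: value}} (hypothetical layout).
    """
    changed = 0
    for (pubid, variable), value in corrections.items():
        if pubid in records and variable in records[pubid]:
            records[pubid][variable] = value
            changed += 1
    return changed


# Placeholder "before" values, invented for illustration.
records = {2032: {"YEMP_INDCODE-2002.04": 0, "YEMP_OCCODE-2002.09": 0}}
n_changed = apply_corrections(records, parse_corrections(CORRECTIONS_TEXT))
print(n_changed, records[2032])
```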