Chapter 2: Sample Design & Fielding Procedures



2.1 Sample Design & Screening Process

Sampling Procedures

The NLSY97 cohort comprises two independent probability samples:  a cross-sectional sample and an oversample of black and/or Hispanic or Latino respondents.  The cohort was selected using these two samples to meet the survey design requirement of providing sufficient numbers of black and Hispanic or Latino respondents for statistical analysis. 

The NLSY97 cohort was selected in two phases, as pictured in Figure 1.  In the first phase, a list of housing units for the cross-sectional sample and the oversample was derived from two independently selected, stratified multistage area probability samples.  This ensured an accurate representation of different sections of the population defined by race, income, region, and other factors.  In the second phase, subsamples of the eligible persons identified in the first phase were selected for interview.

2.1 Figure 1. Selection of NLSY97 Respondents

The listing of eligible housing units was composed of 96,512 households, defined as a single room or group of rooms intended as separate living quarters for a family, for a group of unrelated persons living together, or for a person living alone.  The list of housing units for each sample was selected in the following manner:  First, 100 primary sampling units (PSUs)[1] for each sample were chosen from NORC's 1990 national sample.  In the cross-sectional sample, each PSU represented either a metropolitan area or one or more non-metropolitan counties with a minimum of 2,000 housing units.  The supplemental sample defined PSUs differently from the cross-sectional sample; counties containing large percentages of minorities were merged to create areas containing a minimum of 2,000 housing units.  Second, regardless of sample, segments containing one or more adjoining blocks (and at least 75 housing units) were selected from each PSU.  Finally, a subset of housing units within the segment comprised the NORC listing of households eligible for interview.

The second phase identified all NLSY97-eligible individuals born between 1980 and 1984 (age 12 to 16 as of December 31, 1996) in each household.  NORC interviewers went to the households and administered a short interview called the simple screener, a portion of the Screener, Household Roster, and Nonresident Roster Questionnaire, which collected the age or date of birth of every person linked to a particular household.  The survey collected these data for more than 150,000 people.  In cross-sectional sampling units, if the household included one or more occupants in the eligible age range, NORC interviewers asked those individuals to participate in the first NLSY97 interview.  In supplemental sampling units, the interviewer continued with the extended screener, which established the race and ethnicity of household members.  If a person of the correct age and of black or Hispanic or Latino race/ethnicity resided in the household, he or she was asked to participate in the survey.  Any person in the above age range who completed the first round interview is considered a member of the NLSY97 cohort.  Base year interviews were conducted between January and early October 1997 and between March and May 1998 (see section 2.2 for details).  Of the 9,806 individuals selected for interview during household screenings, a total of 8,984 (91.6 percent) were interviewed.
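The eligibility rules in this paragraph can be sketched in Python.  This is an illustrative sketch only, not NORC's screening software:  the function names, the sampling-unit labels, and the race/ethnicity labels are hypothetical placeholders rather than actual survey codes.

```python
from datetime import date

# Birth-date window for the cohort: January 1, 1980 through December 31, 1984.
ELIGIBLE_START = date(1980, 1, 1)
ELIGIBLE_END = date(1984, 12, 31)
REFERENCE_DATE = date(1996, 12, 31)

# Placeholder labels (assumed for illustration; not actual survey codes).
OVERSAMPLE_GROUPS = {"black", "hispanic_or_latino"}


def age_on(birth_date, reference=REFERENCE_DATE):
    """Age in completed years on the reference date."""
    years = reference.year - birth_date.year
    # Subtract a year if the birthday falls after the reference date.
    if (reference.month, reference.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_age_eligible(birth_date):
    """True if the birth date falls inside the cohort window."""
    return ELIGIBLE_START <= birth_date <= ELIGIBLE_END


def asked_to_participate(birth_date, sampling_unit, race_ethnicity=None):
    """Apply the screening rule for the given type of sampling unit."""
    if not is_age_eligible(birth_date):
        return False
    if sampling_unit == "cross-sectional":
        return True  # age alone decides in cross-sectional units
    if sampling_unit == "supplemental":
        # Extended screener: age plus race/ethnicity.
        return race_ethnicity in OVERSAMPLE_GROUPS
    raise ValueError(f"unknown sampling unit type: {sampling_unit!r}")
```

Under these assumptions, every age-eligible person is 12 to 16 years old on December 31, 1996, which matches the age range given in the text.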

During the NLSY97 screening process, two additional nationally representative samples were identified to participate in the administration of the CAT-ASVAB.  The first group, the Student Testing Program (STP), consisted of students who expected to be in the 10th through 12th grades in the fall of 1997.  Included were many respondents who also participated in the main NLSY97 survey, as well as youths who refused to participate in or were not eligible for the NLSY97.  The second sample, the Enlistment Testing Program (ETP), was a nationally representative sample of youths 18 to 23 years old as of June 1, 1997.  This group provided the normative information used by the Department of Defense to determine the score distribution of military-eligible youths and to help assess the impact of these tests on minority and female military eligibility.

 

[1] There are 100 PSUs in the cross-sectional sample and 100 PSUs in the oversample; however, some PSUs were selected in both samples.  Thus, there are a total of 147 non-overlapping PSUs included in the NLSY97.

Cross-Sectional Sample

For the cross-sectional sample, 54,179 screening interviews were carried out among 1,149 sample segments in 100 primary sampling units (PSUs), drawn from the NORC master probability sample of the United States.  The cross-sectional screening established three samples:

  1. Main NLSY97 Sample:  A cross-sectional sample designed to be representative of young people living in the United States during round 1 and born January 1, 1980, through December 31, 1984.  This sample is designed to maximize the statistical efficiency of samples through the several stages of sample selection (counties, enumeration districts, blocks, sample listing units).  Probabilities of selection are based upon total housing units in a geographic area.

    Following the initial screening process, 7,327 individuals from the cross-sectional sample were designated to be interviewed in the NLSY97 survey; of those, 92.1 percent, or 6,748 respondents, completed the round 1 interview.

  2. Department of Defense Student Testing Program (STP) Sample:  A nationally representative sample of students living in the United States during round 1 and born June 2, 1973, through December 31, 1984, who were in grades 9-11 in the spring or summer of 1997, were not enrolled during the spring and summer but expected to be in grades 10-12 in the fall of 1997, or were enrolled in grades 10-12 during the fall of 1997.  (See the "Administration of the CAT-ASVAB" section of this guide for more information.)  Some NLSY97 respondents were also eligible for the STP sample.

  3. Department of Defense Enlistment Testing Program (ETP) Sample:  A cross-sectional sample designed to be representative of the noninstitutionalized segment of young people living in the United States during round 1 and born June 2, 1973, through June 1, 1979. 

Supplemental Sample

Statistically efficient samples of black and Hispanic or Latino respondents were created by oversampling these minorities in 100 PSUs in NORC's national sample.  For the supplemental sample, 21,112 screening interviews were conducted in 599 sample segments.  The supplemental screening produced three samples:

  1. NLSY97 Black and Hispanic or Latino Oversample:  A supplemental sample designed to oversample Hispanic or Latino and black respondents living in the United States during round 1 and born January 1, 1980, through December 31, 1984.  Stratification specifically relevant for Hispanics or Latinos and blacks was used.  Oversample respondents were chosen with a probability based on size measures for these groups rather than for the general population.  This should make it possible to equalize the distribution of the targeted groups among the various sampling units more than would otherwise be the case.

    After screening, 2,479 individuals from the supplemental sample were designated for interview in the NLSY97, and of these, 90.2 percent, or 2,236 respondents, completed the round 1 interview.

  2. Department of Defense STP Sample:  A nationally representative sample of students, selected regardless of race and/or ethnicity, living in the United States during round 1 and born June 2, 1973, through December 31, 1984.  Members of this sample are those who, depending on the time of the household screening, were in grades 9-11 in the spring or summer of 1997, were not enrolled during the spring and summer but expected to be in grades 10-12 in the fall of 1997, or were enrolled in grades 10-12 during the fall of 1997.

  3. Department of Defense ETP Black and Hispanic or Latino Oversample:  A sample of black and Hispanic or Latino youths living in the United States during round 1 and born June 2, 1973, through June 1, 1979.

Data hint

Users can identify whether each respondent was a member of the cross-sectional or supplemental sample type by referring to the sample type variable (CV_SAMPLE_TYPE, R12358.).

2.1 Table 1. NLSY97 Round 1 Interview Completion

Sample                      Eligible for interviewing   Interviewed round 1   Percent interviewed
Total Cohort                          9806                     8984                 91.6%
  Cross-Sectional Sample              7327                     6748                 92.1%
  Supplemental Sample                 2479                     2236                 90.2%

Screening Procedures

The screening interview was completed by NORC in 75,291 housing units.  These interviews occurred in 1,748 sample segments of 147 non-overlapping PSUs, including most of the fifty states and the District of Columbia.[1]  The screening interview was designed to elicit information allowing identification of household occupants eligible for inclusion in the NLSY97 sample.  The NLSY97 screening interviews were completed within 94.1 percent of the cross-sectional and 93.1 percent of the supplemental occupied housing units selected for screening.  Table 1 presents a summary of completed interviews in round 1.

Sampling procedures were developed to establish links between housing units in the sample PSUs and individuals who might be temporarily absent.  As part of the screening process, household informants were asked if there were any persons for whom the housing unit was the usual place of residence but who were away from the housing unit at the time of the survey.  Included in this group were college students, persons in the military, and persons in prisons or other institutions.  Sampling procedures were also established for those residing in a selected housing unit whose usual place of residence was elsewhere.  Table 2 lists the NLSY97 status (e.g., included in the sample, excluded, or restricted) for youths not in their usual residence at the time of the survey.


 


2.1 Table 2. NLSY97 Sampling Status of Youths by Housing Arrangement

Housing arrangement: Exchange students
Status: Included if the youth lived in the sample housing unit for at least six months during 1997.

Housing arrangement: Youths whose temporary residence was a group quarters structure (e.g., prisons, boarding school, college dormitories)
Status: Included if their usual place of residence was in a selected PSU. Excluded otherwise.

Housing arrangement: Youths whose usual place of residence was not in a selected PSU, but whose temporary residence was within a PSU
Status: Excluded.

Housing arrangement: Youths in a foreign school
Status: Included.

Housing arrangement: Youths linked to two or more housing units
Status: If the respondent's mother is alive and her housing unit is in a sample housing unit, the youth is linked there. Otherwise, the youth is linked to the father's housing unit. If neither the mother nor the father is living in a sample housing unit, the youth is linked to one of the sample housing units at random.

Housing arrangement: Youths who cannot be linked to any other housing unit
Status: Included if the youth is residing at a sample housing unit when the screening interview is conducted.
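The linking rule for youths connected to two or more housing units can be sketched as follows.  This is one reading of the rule in Table 2, not the procedure's actual implementation:  the function name and arguments are hypothetical, and the sketch assumes that the father's unit is used only when it is itself a sample housing unit.

```python
import random


def link_housing_unit(mother_unit, father_unit, candidate_units,
                      sample_units, mother_alive=True, rng=random):
    """Return the sample housing unit to which a multiply linked youth is assigned.

    mother_unit / father_unit: unit identifiers, or None if unknown.
    candidate_units: all units to which the youth is linked.
    sample_units: set of units that are in the sample.
    """
    # 1. Mother's unit takes priority when she is alive and her unit is sampled.
    if mother_alive and mother_unit in sample_units:
        return mother_unit
    # 2. Otherwise the father's unit (assumed here to require a sampled unit).
    if father_unit in sample_units:
        return father_unit
    # 3. Otherwise choose at random among the youth's sampled units.
    eligible = [unit for unit in candidate_units if unit in sample_units]
    return rng.choice(eligible) if eligible else None
```

Passing a seeded `random.Random` instance as `rng` makes the random step reproducible.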

Siblings:

The NLS sample design, which selected every eligible person connected to the housing unit, generated a sample of siblings living in the same housing unit and satisfying the NLSY97 age restrictions.  However, the NLSY97 samples do not contain nationally representative samples of siblings of all ages and living arrangements.  Care should be used in generalizing from the findings of sibling studies based on the NLSY97.  Table 3 shows the numbers of sibling groups in the NLSY97.

2.1 Table 3. Round 1 Distribution of NLSY97 Sibling Groups

Type                        Respondents
No Siblings                        5129
Total Multiple Siblings            3855
  2 Siblings                       3134
  3 Siblings                        627
  4 Siblings                         84
  5 Siblings                         10
Total                              8984

Note: Table based on the household ID code (R11930.) and the relationship variables from the round 1 household roster (HHI2_RELx.xx). Siblings include biological, adoptive, half-, and step- relationships but not foster relationships.
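A tabulation like Table 3 counts respondents, not sibling groups.  The sketch below shows that counting step, assuming each respondent has already been assigned a sibling-group identifier (a hypothetical input; deriving it from the household ID and roster relationship variables is not shown).

```python
from collections import Counter


def respondents_by_group_size(group_ids):
    """Map sibling-group size to the number of respondents in groups of that size.

    group_ids: one (hypothetical) sibling-group identifier per respondent.
    """
    members_per_group = Counter(group_ids)
    by_size = Counter()
    for size in members_per_group.values():
        by_size[size] += size  # add respondents, not groups
    return dict(by_size)


# Made-up identifiers for seven respondents in four groups.
example = ["g1", "g2", "g2", "g3", "g3", "g3", "g4"]
counts = respondents_by_group_size(example)

# Consistency check on the published figures in Table 3.
table3 = {1: 5129, 2: 3134, 3: 627, 4: 84, 5: 10}
multiple = sum(n for size, n in table3.items() if size > 1)
total = sum(table3.values())
```

With the published figures, `multiple` equals 3855 and `total` equals 8984, matching the subtotal and total rows of the table.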

Other technical information on the sample assignment process can be found in (1) the Field Interviewer Reference Manual, which includes a copy of the screening instrument, and (2) the Technical Sampling Report, which describes the NLSY97 sample selection procedures for both subsamples.  Both of these documents are available at www.bls.gov/nls.



2.2 Interview Methods

This section first discusses the data collection methods used for the core round 1 survey instruments: the Screener, Household Roster, and Nonresident Roster Questionnaire; the Youth Questionnaire; and the Parent Questionnaire. The content of these instruments is described in section 1.4, "Content of the NLSY97."  The section then briefly describes interview administration in subsequent survey rounds.  Finally, it discusses supplemental NLSY97 studies, including school surveys, transcript surveys, and the CAT-ASVAB.

Users should note that respondents received $10 for their participation in rounds 1-3, and responding parents received $10 when they completed the round 1 interview. In round 4, survey administrators offered different levels of incentives to respondents in an effort to study the effects of incentive level on survey participation. Three levels of compensation were offered: $10, $15, and $20.  In addition, half of the respondents at each level were paid in advance and half were paid upon completion of the interview. Both the level and the timing of the compensation are included in the variable PAYINCENT, found in the round 4 data.  In rounds 5 and 6 all respondents received $20.  In round 7, respondents who had not completed the round 6 interview were eligible for an incentive experiment (R7_INCENTIVE), in which respondents in the experimental group were offered an additional $5 for each consecutive round in which they had not participated (up to a maximum of an additional $15), while respondents in the control group were offered the standard $20 incentive.
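The round 4 design and the round 7 experiment can be illustrated with a short sketch.  It assumes that the "additional" round 7 amount is paid on top of the standard $20; the function and constant names are hypothetical and do not reflect the coding of PAYINCENT or R7_INCENTIVE.

```python
from itertools import product

# Round 4: three payment levels crossed with two payment timings.
ROUND4_LEVELS = (10, 15, 20)
ROUND4_TIMINGS = ("advance", "on completion")
round4_cells = list(product(ROUND4_LEVELS, ROUND4_TIMINGS))  # six design cells

# Round 7 experiment (dollar amounts from the text).
STANDARD_INCENTIVE = 20
BONUS_PER_MISSED_ROUND = 5
MAX_BONUS = 15


def round7_incentive(consecutive_rounds_missed, experimental):
    """Dollar amount offered in round 7 under the assumptions stated above."""
    if not experimental:
        return STANDARD_INCENTIVE  # control group: standard incentive only
    bonus = min(BONUS_PER_MISSED_ROUND * consecutive_rounds_missed, MAX_BONUS)
    return STANDARD_INCENTIVE + bonus
```

For example, an experimental-group respondent who missed two consecutive rounds would be offered $30 under this reading, and the bonus stops growing after three missed rounds.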

The field periods have differed somewhat across rounds.  Table 1 indicates when rounds 1-9 were fielded, along with the sample sizes and retention rates for each round.

2.2 Table 1. NLSY97 Sample Sizes, Retention Rates, and Fielding Periods

                                                 Cross-sectional sample Supplemental sample    Total sample
Round  Fielding period                           Total  Retention rate  Total  Retention rate  Total  Retention rate
1      February-October 1997 and March-May 1998  6748   --              2236   --              8984   --
2      October 1998-April 1999                   6279   93.0            2107   94.2            8386   93.3
3      October 1999-April 2000                   6172   91.5            2036   91.1            8208   91.4
4      November 2000-May 2001                    6054   89.7            2026   90.6            8080   89.9
5      November 2001-May 2002                    5918   87.7            1964   87.8            7882   87.7
6      November 2002-July 2003                   5898   87.4            1998   89.4            7896   87.9
7      October 2003-July 2004                    5782   85.7            1972   88.2            7754   86.3
8      October 2004-July 2005                    5600   83.0            1902   85.1            7502   83.5
9      October 2005-July 2006                    5437   80.6            1901   85.0            7338   81.7

Note: Retention rate is defined as the percentage of base year respondents remaining eligible who were interviewed in a given survey year; deceased respondents are included in the calculations.
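As a worked example of the retention rate calculation, the sketch below divides the number interviewed in a round by the round 1 (base year) total for the same sample.  Using the round 1 totals as denominators is an inference from the published figures, which this calculation reproduces; the function name is hypothetical.

```python
# Round 1 (base year) totals from Table 1.
BASE_YEAR = {"cross_sectional": 6748, "supplemental": 2236}
BASE_YEAR["total"] = BASE_YEAR["cross_sectional"] + BASE_YEAR["supplemental"]


def retention_rate(interviewed, base_year_total):
    """Percentage of base year respondents interviewed, to one decimal place."""
    return round(100 * interviewed / base_year_total, 1)


# Round 9 counts from Table 1.
round9 = {"cross_sectional": 5437, "supplemental": 1901}
round9["total"] = round9["cross_sectional"] + round9["supplemental"]

round9_rates = {
    sample: retention_rate(count, BASE_YEAR[sample])
    for sample, count in round9.items()
}
```

The resulting round 9 rates are 80.6, 85.0, and 81.7 percent for the cross-sectional, supplemental, and total samples, in agreement with the table.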

Round 1 Interview Methods

Fielding Period: Most round 1 NLSY97 interviews were conducted between January and early October 1997. Due to concerns about the number of eligible youths found during the initial field period, investigators decided to conduct a refielding between March and May 1998. During this second part of the initial survey round, 395 additional respondents were interviewed. These respondents were administered the same instrument as those initially interviewed in 1997. See section 2.3 for more information about the composition of the NLSY97 sample.  

Data hint

Respondents selected for the NLSY97 sample during the refielding are identified by the refielding symbol (CV_REFIELD_YOUTH).

Researchers analyzing topics where time periods are critical should carefully examine the reference period of the questions and the actual interview date for individual respondents. In particular, the round 1 fielding period has implications for questions on education; see section 4.2.2, "Educational Status & Attainment," for more information.

Researchers should also pay close attention to the elapsed time between interviews for each respondent. While the time between the first and second interviews was about 18 months for most respondents, it may be less for those first interviewed during the refielding period.

Data hint

The respondent's interview date for each round can be identified by using a set of three created variables: (1) CV_INTERVIEW_DATE_D, (2) CV_INTERVIEW_DATE_M, and (3) CV_INTERVIEW_DATE_Y. To determine the date of the previous interview, researchers should first identify the round when the respondent was last interviewed (see SYMBOL!ROUND), then pick up the corresponding date of interview for that round.
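The following sketch shows one way to combine the three date components and compute the elapsed time between two interviews.  It assumes that the day, month, and year values have already been read into integers, that the year is stored with four digits, and that negative values indicate missing data; these are assumptions about the data layout, and the function names are hypothetical.

```python
from datetime import date


def interview_date(day, month, year):
    """Combine day, month, and year components into a date.

    Returns None if any component is negative (assumed to mean missing).
    """
    if min(day, month, year) < 0:
        return None
    return date(year, month, day)


def months_between(earlier, later):
    """Whole months elapsed from the earlier date to the later date."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


# Hypothetical respondent interviewed in the main round 1 field period.
first = interview_date(15, 3, 1997)
second = interview_date(20, 11, 1998)
gap_main = months_between(first, second)

# Hypothetical respondent first interviewed during the 1998 refielding.
refield_first = interview_date(10, 4, 1998)
gap_refield = months_between(refield_first, second)
```

In this made-up example the gap is 20 months for the first respondent and 7 months for the refielding respondent, which illustrates why elapsed time should be checked case by case.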

 

Screener, Household Roster, and Nonresident Roster Questionnaire

Choice of household informant:  To identify youths potentially eligible for the NLSY97, the screener collected data from selected households within a sample area.  A single member of the household, designated as the household informant, was asked to provide certain information on persons who usually resided in the household.  To ensure more accurate reporting of these data, the NLSY97 required the household informant to be age 18 or older and to consider the selected household his or her usual place of residence. 

Computer-Assisted Personal Interview (CAPI):  After a household informant was chosen to complete the Screener, Household Roster, and Nonresident Roster Questionnaire, interviewers used a CAPI system to collect data.  Computer software automatically guided interviewers through an electronic questionnaire, selecting the next question based on a respondent's answers.  The program also prevented interviewers from entering invalid values and warned interviewers about implausible answers.  A set of checks within the CAPI system lowered the probability of inconsistent data both during an interview and over time. To ensure that accurate data were collected from Spanish-speaking respondents, CHRR prepared both English and Spanish versions of all survey instruments, and NORC employed bilingual Spanish-speaking interviewers to administer the Spanish version to those requesting it.  During the initial round, the Spanish version of the questionnaire was requested by 297 responding parents and 96 NLSY97 youths.

Screen and Go:  In round 1, use of the computer-assisted personal interviewing system (CAPI) allowed for a screen and go method of screening households.  When an NLSY97-eligible youth was identified in the simple screener portion of the interview, information from the remainder of the Screener, Household Roster, and Nonresident Roster Questionnaire was collected.  Selected data (e.g., basic demographic information, a roster of household members) were then transferred automatically into the Parent and Youth Questionnaires for verification and use during the interview.  Therefore, the interviewer could administer the parent or the youth portion of the NLSY97 immediately.  It was expected that this would increase the likelihood that eligible youths would participate in the survey since the number of visits interviewers had to make to a household decreased.

However, in some cases, the respondents (parent and youth) were not available to participate in the parent and youth interviews immediately after screening.  In these cases, a screen and come back method was utilized, in which the interviewer made an appointment to return to the household to administer the Youth and Parent Questionnaires at a convenient time.

Paper Screener:  During round 1, the interviewers had the option of using a paper screener to perform the initial screening of the household.  The paper screener collected the same basic information as the initial CAPI screener.  This was useful in cases where the simple screener information could not be collected using CAPI (e.g., weather conditions, computer battery life, dangerous neighborhood) and also gave the interviewer an alternative medium for collecting the initial screener data.  Like the screen and go model, the paper screener was designed to determine if anyone residing in the housing unit was eligible for either the NLSY97 or the administration of the CAT-ASVAB.  If a youth was identified as being potentially eligible for the NLSY97, the information from the paper screener was entered into CAPI.  The interviewer could then continue in CAPI with the Screener, Household Roster, and Nonresident Roster Questionnaire and the Youth and Parent Questionnaires.  Approximately 28,000 paper screeners were administered, including those used for the screen and come back method described above.

Proxy Screener:  In cases where a round 1 interviewer made several visits to a household and still could not contact household members to administer the initial screener, a proxy screener was administered to an adult living either next door to or directly across from the selected housing unit.  Before the interviewer could administer a proxy screener, at least three attempts were made by the interviewer, on different days and at different times, to contact anyone in the selected housing unit.

The purpose of the proxy screener, a paper questionnaire, was to assess whether a person eligible for the NLSY97 resided in the household.  In particular, the proxy screener was designed to determine the best time to establish contact with a household member, whether or not a person between the ages of 8 and 28 currently lived in the household, and the steps required to contact a household member.  The broad 8-28 age range was intended to ensure that youths close to the endpoints of the actual age range were not missed due to inaccurate reporting.  If the proxy screener indicated that none of the household members were in the age range of 8 to 28, the screener was coded as a proxy screener and no more attempts were made to contact the household.  However, if the proxy informant was unable to definitively deny the presence of residents ages 8-28, the interviewer was instructed to return as many times as reasonable and necessary to administer the simple screener and, if appropriate, the remainder of the survey instruments.  A total of 5,175 proxy screeners determined that no one between ages 8 and 28 lived in the household.

Gatekeepers:  The gatekeeper disposition code was used in cases where the interviewer could not gain direct access to the sample household, such as a high-rise building with a locked door where access was denied by a building manager or a gated housing community where the entry guard refused entrance.  In these cases, the interviewer asked the gatekeeper or other community official whether anyone between the ages of 8 and 28 lived in the sample households.  If the gatekeeper was unable to definitively deny the presence of household members ages 8-28, the interviewer then attempted to gain access to the household in order to complete the Screener, Household Roster, and Nonresident Roster Questionnaire and was not permitted to use this disposition code.  A total of 4,055 cases were closed with a gatekeeper disposition code after the interviewer determined that no one between ages 8 and 28 lived in the household.  This code was mainly used in gated housing communities for senior citizens.

Telephone Screener:  In rare cases, the simple screener was conducted by telephone at the conclusion of the field period.  A total of 931 telephone screeners were administered.  Instances in which the housing unit was contacted by telephone include:

  1. The proxy screener revealed a person between the ages of 8 and 28 living in the household and the interviewer was unable to contact anyone in the housing unit on three subsequent in-person visits; or

  2. The interviewer made three in-person visits but was unable to find a neighbor to whom he or she could administer the proxy screener.

The full Screener, Household Roster, and Nonresident Roster Questionnaire was also administered by telephone in rare instances.  Situations in which the full instrument was conducted by telephone include:

  1. After completing the paper screener, the interviewer was unable to contact anyone in the housing unit to complete the full extended screener.  At least three in-person contacts must have been attempted before the telephone contact was approved.

  2. The sample housing unit was inside a residential community to which the interviewer was barred access by the community (e.g., housing board authority).  Prior to the telephone interview, the correct person must have been contacted about gaining access at least three times (in person, by telephone, or by letter).

NLSY97 Parent Questionnaire and Youth Questionnaire

When the Screener, Household Roster, and Nonresident Roster Questionnaire was complete, any NLSY97-eligible youth(s) and one of the youth's parents (the responding parent) were interviewed using CAPI.  Prior to these interviews, selected data (e.g., basic demographic information, a roster of household members) were automatically transferred into the Parent Questionnaire and the Youth Questionnaire for verification and use during the interviews.  Consequently, the interviewer was able to administer the parent or the youth portion of the NLSY97 immediately.  CAPI interviews were conducted in either English or Spanish; parent and youth respondents could choose either version.

Data hint

In round 1, the NLSY97 youth respondent(s) and responding parent(s) in the household are listed on the household roster, but they are referred to as "Household Member #" in the same way as noninterviewed household members. The youth respondent's position on the household roster can be identified by using the variable YOUTH_HHID.01. The responding parent's position on the roster is provided in PARYOUTH_PARENTID. See section 4.6.5, "Household Composition," for further discussion of the structure and use of the household roster.

Choice of Parent:  One parent of each respondent was asked to participate in the parent interview.  This parent was identified during the household roster portion of the survey.  The responding parent (or guardian) was asked for extensive background information, including marital and employment histories.  He or she was also asked to answer questions about the family in general, as well as to provide information about aspects of his or her (NLSY97-eligible) children's lives.

The choice of the preferred responding parent was based on the pre-ordered list in Figure 1.  For example, a biological mother was chosen before a biological father, and so forth.  However, in some cases a parent figure lower on the list was chosen if a parent higher on the list was in the household but was not available at the time of the interview.  If the youth did not live with a parent-type figure, or lived with a guardian or parent not listed, no parent was interviewed; the youth's record will not contain any data from the Parent Questionnaire.  Users should note that the records of some youths who do live with a listed parent or parent-figure do not contain any data from the Parent Questionnaire due to nonresponse.

2.2 Figure 1. Priority for Choosing Responding Parent

   1  Biological mother
   2  Biological father
   3  Adoptive mother
   4  Adoptive father
   5  Stepmother
   6  Stepfather
   7  Guardian, relative
   8  Foster parent, youth lived with for 2 or more years
   9  Other non-relative, youth lived with for 2 or more years
  10  Mother-figure, relative
  11  Father-figure, relative
  12  Mother-figure, non-relative youth lived with for 2 or more years
  13  Father-figure, non-relative youth lived with for 2 or more years
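The priority rule in Figure 1 amounts to picking the highest-ranked parent figure who is available for interview.  The sketch below illustrates that logic; the function name is hypothetical, the labels are copied from the figure, and the set of available figures is assumed to be known to the caller.

```python
# Priority order copied from Figure 1 (highest priority first).
PARENT_PRIORITY = [
    "Biological mother",
    "Biological father",
    "Adoptive mother",
    "Adoptive father",
    "Stepmother",
    "Stepfather",
    "Guardian, relative",
    "Foster parent, youth lived with for 2 or more years",
    "Other non-relative, youth lived with for 2 or more years",
    "Mother-figure, relative",
    "Father-figure, relative",
    "Mother-figure, non-relative youth lived with for 2 or more years",
    "Father-figure, non-relative youth lived with for 2 or more years",
]


def choose_responding_parent(available_figures):
    """Return the highest-priority listed figure who is available, else None.

    available_figures: labels of parent figures in the household who are
    available at the time of the interview.
    """
    for label in PARENT_PRIORITY:
        if label in available_figures:
            return label
    return None  # no listed parent figure: no parent interview
```

Because only available figures are passed in, a lower-ranked figure is returned when a higher-ranked one lives in the household but cannot be interviewed.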

Interviews are available with 6,124 parents; 7,942 youth respondents have information available from a parent interview. Table 2 shows the number of respondents by age who had a parent participate in the round 1 survey.

2.2 Table 2. NLSY97 Youths by Age and Parent Interview Availability

Age (birth year)    Total number of youths    Youths with a parent interview
12 (1984)                   1771                      1583 (89.4%)
13 (1983)                   1807                      1615 (89.4%)
14 (1982)                   1841                      1595 (86.6%)
15 (1981)                   1874                      1668 (89.0%)
16 (1980)                   1691                      1481 (87.6%)
Total                       8984                      7942 (88.4%)

Note: Table based on R05367. and R07359.

In multiple respondent households, more than one parent may have been interviewed during round 1 if the selection criteria above indicated different parents for different NLSY97-eligible youths in the household.  For example, if a couple residing in a sample household each had an NLSY97-eligible youth from a previous marriage, the biological parent of each youth would be interviewed.  The survey first collected parent-specific information from each parent and then asked for information about the NLSY97-eligible youth matched to that parent.  In this example, each parent would be asked to provide youth-specific information for his or her NLSY97-eligible biological child.

Due to a computer programming error, however, both parents in some multiple respondent households were asked to provide youth-specific information only for the oldest NLSY97-eligible youth(s) living in the household.  In the example above, both parents would be asked to give information about the older of the two children.  In these infrequent instances, the correct parent-specific information is matched to each youth, but one or more youths in the household do not have any youth-specific information.  This programming error was corrected during the survey period and affected only 33 youth cases.

Audio Computer-Assisted Self-Interview (ACASI):  The parent and youth portions of the NLSY97 survey used an audio computer-assisted self-interview (ACASI) to obtain potentially sensitive information.  The respondent was able to listen to the questions with earphones or turn off the audio and read the questionnaire from the computer screen.  Compared to traditional paper-and-pencil self-administered sections, the computerized version permits more complex questionnaire structuring, and the audio component theoretically improves response quality when the respondent's literacy is in question.  As with the interviewer-administered instruments, the ACASI was available in Spanish or English.

User Notes: Each NLSY97 questionnaire includes an interviewer remarks section, which interviewers complete after finishing the interview with the respondent. This section records objective information about the interview, such as the presence of another person during the survey, where the interview took place, and the language in which the questionnaire was administered. Interviewers are also asked to provide an overall assessment of the interview. See section 5.3, "Interviewer Remarks," for more details.

Rounds 2-9 Interview Methods

Fielding Periods: The round 2 survey was conducted from October 1998 through April 1999.  Most respondents were surveyed approximately 18 months after their first interview, although the elapsed time between interviews is substantially less for some respondents.  Refer to 2.2 Table 1 for fielding dates for all rounds.

Locating respondents is a coordinated effort of NORC's central office, locating shop, and local-level field staff.  Prior to fielding, NORC's central office sends a short, informative "locator letter" to each respondent reminding him or her of the upcoming interview and confirming the respondent's current address and phone number.  During the field period, field interviewers use contact information to track down hard-to-find respondents, while central office staff assist with database searches and other centralized locating methods.

Youth Questionnaire:  As in round 1, interviews in each round are conducted using a CAPI instrument, administered by an interviewer with a laptop computer.  The preferred mode of interview is in person.  During sensitive portions of an in-person interview, respondents enter their answers directly into the laptop rather than interacting with the interviewer.  This self-administered portion, called ACASI, includes an audio option so that respondents can listen to the questions and answers being read via headphones if they prefer.  In some cases, because of the respondent's location or reluctance to be interviewed in person, interviews are conducted by phone.  In these cases the interviewer must administer the self-administered (SAQ) sections.  Table 3 shows the number of in-person and telephone interviews for each round.

2.2 Table 3. NLSY97 interview mode

          Personal        Telephone       Not available   Interviewed     Not interviewed
Year      N       %       N       %       N       %       N       %       N       %
Round 1   8700    96.8    284     3.2     0       --      8984    --      --      --
Round 2   7924    94.5    460     5.5     2       (1)     8386    93.3    598     6.7
Round 3   7552    92.0    655     8.0     1       (1)     8208    91.4    776     8.6
Round 4   7372    91.2    706     8.7     2       (1)     8080    89.9    904     10.1
Round 5   7215    91.5    664     8.4     2       (1)     7882    87.7    1102    12.3
Round 6   6614    83.8    1281    16.2    1       --      7896    87.9    1088    12.1
Round 7   6825    88.0    927     12.0    2       (1)     7754    86.3    1230    13.7
Round 8   6577    87.7    925     12.3    2       (1)     7502    83.5    1482    16.5
Round 9   6348    86.5    989     13.5    1       (1)     7338    81.7    1646    18.3

NOTE: Table created using the variable YIR-560. Telephone was mode of interview for 223 round 1 parent interviews.

(1) Less than 0.05%.
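The percentages in Table 3 appear to use two different denominators: the mode shares are taken over respondents interviewed in that round, while the interviewed and not-interviewed shares are taken over the 8,984 round 1 respondents. The following is a minimal sketch of that arithmetic using the round 2 counts from the table. The function and variable names are illustrative assumptions and are not part of any NLSY97 software.

```python
# Round 1 sample size, as stated in the text.
BASE_YEAR_RESPONDENTS = 8984


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Round 2 counts copied from 2.2 Table 3.
personal, telephone, not_available = 7924, 460, 2

interviewed = personal + telephone + not_available
not_interviewed = BASE_YEAR_RESPONDENTS - interviewed

# Mode shares: denominator is the number interviewed in the round.
personal_share = pct(personal, interviewed)
telephone_share = pct(telephone, interviewed)

# Interview shares: denominator is the round 1 sample.
interviewed_share = pct(interviewed, BASE_YEAR_RESPONDENTS)
not_interviewed_share = pct(not_interviewed, BASE_YEAR_RESPONDENTS)

print(personal_share, telephone_share, interviewed_share, not_interviewed_share)
```

With these inputs the sketch reproduces the round 2 row of the table.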

Household Income Update:  In rounds 2-5, this brief questionnaire collected basic income information from one of the respondent's parents (usually the parent who signed the youth's interview consent form).  All respondents who lived with a parent were eligible for this questionnaire, regardless of age or other criteria for independence.  The parent answered these questions on a self-administered paper instrument.  Interviewers then entered the data into a computer-assisted questionnaire on their laptops and attached the information to the records of all NLSY97 youths in the household.  Additional quality control checks were performed in the central office, where hard copy questionnaires were reviewed against the coded data.  In round 2, parents of 7,601 respondents answered at least one question from the Household Income Update.  Parents of 5,488 respondents answered at least one question in round 3, and 5,225 parents of respondents answered at least one question in round 4.  Parents of 4,090 respondents answered at least one question in round 5. Beginning in round 6, all respondents were at least 18 years old, so the Household Income Update is no longer administered.

Supplemental NLSY97 Studies

School Survey (1996).  Designed with an emphasis on the school-to-work transition, round 1 of the NLSY97 also included a mail survey of schools.  Principals (or their proxies) were asked to complete a self-administered instrument that focused on institutional-level attributes such as school policies and management as well as student-level "experience" data.  See section 4.2.5, "School & Transcript Surveys," for more detail about the content of the survey.

Schools in the NLSY97 sample areas that had a 12th grade comprised the sample for this survey.  As depicted in Figure 1 in section 2.1 of this chapter, the NLSY97 sample was drawn from 147 primary sampling units (PSUs).[1]  The PSUs were further divided into sample segments.  All schools in any county with a segment selected for NLSY97 sampling were included in the survey.  There were some counties in the PSUs from which no sample segments were selected.  The 1996 survey did not include schools in these counties.  Schools were identified using the Quality Education Data (QED) file, a proprietary national database of primary and secondary schools in the United States.

The original school survey form was mailed in September 1996; in-scope schools that did not respond by December 1996 were sent a shorter version of the survey, the "critical items" questionnaire.  Of the 7,390 in-scope schools that received the survey, 5,295 responded to either the original school survey or the critical items questionnaire.  The response rate by the end of the field period, April 5, 1997, was 71.6 percent.

Answer forms for the original school survey were electronically scanned by NORC.  However, some hand editing was necessary.  The majority of the edited questions were in decimal format.  To ensure clean data, the answers were verified by randomly selecting cases, keying the data, and comparing the keyed data files against the scanned data files.  The critical items questionnaire did not use a scannable format; the data were keyed using Computer Assisted Data Entry (CADE) and verified twice.

School Survey (2000).  Round 3 of the NLSY97 also included a repeat survey of schools.  Principals (or their proxies) were asked to complete a self-administered instrument similar to that used in 1996.  To reduce the time burden, questionnaire items from the 1996 instrument were modified to encourage respondents to provide approximate values rather than requiring them to consult administrative records for exact figures.  See section 4.2.5, "School & Transcript Surveys," for more detail about the content of the survey.

As in 1996, schools in the NLSY97 PSUs that had a 12th grade were mailed survey instruments.  However, the 2000 sample was expanded to include vocational schools.  The sample also included schools in the counties that were in NLSY97 PSUs but did not include any sample segments.  Schools in these counties had been omitted from the 1996 survey but were included for limited data collection in 2000.  No telephone follow-up was done for schools in these "omitted counties."  Finally, in addition to the geographically based sample, other schools were included if an NLSY97 respondent was enrolled during round 2 and that school met the grade and program requirements for eligibility.  Schools were identified using the 1998 Quality Education Data (QED) file. 

By January 2000, survey staff had secured cooperation from state school officers and local school districts.  In February 2000, questionnaires were mailed to 9,632 sampled schools, including 8,925 schools in a longitudinal sample (comparable to the 1996 school survey), 492 in the omitted counties sample, and 215 eligible only due to round 2 youth enrollment.  After mail and telephone follow-up, 5,955 schools (71.6 percent) in the longitudinal sample completed questionnaires.  The overall response rate for all schools in the 2000 survey was 71 percent.

Due to "births" and "deaths" of schools between 1996 and 2000 and nonresponse in 1996, not all schools in the longitudinal sample are present in the 1996 data.  The retention rate of 1996 schools into the 2000 survey was 74.2 percent (3,900 of 5,253).

Transcript Survey.  At two separate points in time, the NLSY97 program sought high school transcripts for respondents who were no longer enrolled in high school and for whom field interviewers had secured parent and respondent consent for transcript release.  Eligible respondents were those who either had graduated from high school or who were age 18 or older and no longer enrolled in high school.  The first wave of transcripts was collected in 1999-2000, the second wave in 2004.  Transcripts were received and processed for 1,417 respondents in Wave 1 and 4,815 respondents in Wave 2 for a combined total of 6,232 respondents.  Using course catalogs, transcript data, and clarification calls to school administrators, survey staff constructed histories of courses taken and term enrollment calendars for each youth.  Data files also include information on absences, standardized test scores, and indicators of special education, gifted/talented, and high school graduation status.  Courses were coded into the Revised Secondary School Taxonomy (SST-R).  Public use data are available on the round 7 main release.

CAT-ASVAB:  From summer 1997 through spring 1998, most NLSY97 respondents were administered the computer adaptive version of the Armed Services Vocational Aptitude Battery (CAT-ASVAB).  See section 4.1.2, "Administration of the CAT-ASVAB," for more information.


 

[1] There are 100 PSUs in the cross-sectional sample and 100 PSUs in the oversample; however, some PSUs were selected in both samples.  Thus, a total of 147 non-overlapping PSUs are included in the NLSY97.

 



2.3 Sample Size & Composition

For more information about the representativeness of the sample members, users should consult the NLSY97 Technical Sampling Report (2000).  Although fewer age-eligible youths than expected were found during the household screenings, no correlation has been identified between education, income, area of residence, etc., and participation in the survey. 

Of the youths eligible for interview in the first round, 8,984 were actually interviewed.  Table 1 illustrates the racial, ethnic, and gender composition of the initial sample and the respondents participating in subsequent rounds.

User Notes: The initial NLSY97 data release contained records for 9,022 respondents. However, an evaluation of the round 1 data revealed that 38 of these respondents either were not age-eligible for the cohort or were duplicates. The records of these out-of-scope respondents have been removed from the data, and numbers in this guide have been updated to reflect the new sample size of 8,984 respondents. Identification numbers of dropped respondents are included in the round 1 NLSY97 Codebook Supplement and are available from NLS User Services.

2.3 Table 1. Racial, Ethnic & Gender Composition of NLSY97 Sample

                  Race/Ethnicity
Gender            Black     Hispanic or Latino    Non-black/non-Hispanic    Mixed    Total

Round 1
  Male            1169      977                   2413                      40       4599
  Female          1166      924                   2252                      43       4385
  Total           2335      1901                  4665                      83       8984

Round 2
  Male            1103      904                   2238                      38       4283
  Female          1101      868                   2095                      39       4103
  Total           2204      1772                  4333                      77       8386

Round 3
  Male            1062      875                   2193                      39       4169
  Female          1071      853                   2076                      39       4039
  Total           2133      1728                  4269                      78       8208

Round 4
  Male            1065      861                   2153                      37       4116
  Female          1059      837                   2027                      41       3964
  Total           2124      1698                  4180                      78       8080

Round 5
  Male            996       846                   2110                      36       3988
  Female          1036      828                   1991                      39       3894
  Total           2032      1674                  4101                      75       7882

Round 6
  Male            1033      845                   2083                      36       3997
  Female          1054      834                   1973                      38       3899
  Total           2087      1679                  4056                      74       7896

Round 7
  Male            1015      817                   2060                      36       3928
  Female          1046      825                   1916                      39       3826
  Total           2061      1642                  3976                      75       7754

Round 8
  Male            939       793                   1966                      34       3732
  Female          1054      811                   1866                      39       3770
  Total           1993      1604                  3832                      73       7502

Round 9
  Male            947       776                   1907                      33       3663
  Female          1034      782                   1823                      36       3675
  Total           1981      1558                  3730                      69       7338

Note: Table based on KEY!RACE_ETHNICITY (R14826.), KEY!SEX (R05363.), and RNI (R25102., R38277., etc.).



2.4 Retention and Reasons for Noninterview

After the initial survey round, some sample members do not respond to one or more subsequent interviews. Table 1 shows the retention rates by sample type for rounds 2-9 of the NLSY97.

2.4 Table 1. Retention Rates by Sample Type and Gender

 

                  Cross-sectional             Supplemental                Sample total
                  Interviewed   Retention     Interviewed   Retention     Interviewed   Retention
                                rate (%)                    rate (%)                    rate (%)

Round 2
  Male            3213          92.9          1070          93.9          4283          93.1
  Female          3066          93.2          1037          94.6          4103          93.6
  Total           6279          93.0          2107          94.2          8386          93.3

Round 3
  Male            3144          90.9          1026          90.0          4170          90.7
  Female          3029          92.1          1010          92.2          4039          92.1
  Total           6173          91.5          2036          91.1          8209          91.4

Round 4
  Male            3097          89.6          1019          89.4          4116          89.5
  Female          2957          89.9          1007          91.9          3964          90.4
  Total           6054          89.7          2026          90.6          8080          89.9

Round 5
  Male            3011          87.1          977           85.7          3988          86.7
  Female          2907          88.4          987           90.1          3894          88.8
  Total           5918          87.7          1964          87.8          7882          87.7

Round 6
  Male            2995          86.6          1002          87.9          3997          86.9
  Female          2903          88.3          996           91.0          3899          88.9
  Total           5898          87.4          1998          89.4          7896          87.9

Round 7
  Male            2951          85.3          977           85.7          3928          85.4
  Female          2831          86.1          996           90.1          3826          87.3
  Total           5782          85.7          1972          88.2          7754          86.3

Round 8
  Male            2816          81.4          916           80.4          3732          81.2
  Female          2784          84.7          986           90.1          3771          86.0
  Total           5600          83.0          1902          85.1          7502          83.5

Round 9
  Male            2731          78.9          932           81.7          3663          79.6
  Female          2706          82.3          969           88.4          3675          83.8
  Total           5437          80.1          1901          85.0          7338          81.7

Note: Table based on RNI (R25102., R38277., etc.), KEY!SEX (R05363.), and CV_SAMPLE_TYPE (R12358.). Retention rate is defined as the percentage of all base-year respondents participating in a given survey. Deceased respondents are included in the calculations.
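The retention rate definition in the note can be illustrated with a short calculation. The sketch below assumes the round 1 counts by gender from 2.3 Table 1 as the base-year denominators and, following the note, does not reduce the denominator for deceased respondents. All names are illustrative and not taken from NLSY97 software.

```python
# Base-year (round 1) respondents by gender, from 2.3 Table 1.
BASE_YEAR = {"Male": 4599, "Female": 4385}


def retention_rate(interviewed, base_year):
    """Percentage of base-year respondents interviewed in a given round.

    Deceased respondents stay in the denominator, per the table note.
    """
    return round(100 * interviewed / base_year, 1)


# Round 2 interviewed counts (Sample total columns of 2.4 Table 1).
round2 = {"Male": 4283, "Female": 4103}

rates = {sex: retention_rate(n, BASE_YEAR[sex]) for sex, n in round2.items()}
total_rate = retention_rate(sum(round2.values()), sum(BASE_YEAR.values()))

print(rates, total_rate)
```

These inputs give the round 2 sample total rates reported in the table.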

For each respondent who is not interviewed in a given round, NORC personnel assign a reason for noninterview code, contained in the variable RNI. Tables 2-4 summarize the reasons for noninterview among NLSY97 respondents during rounds 2-9.
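A cross-tabulation like those in Tables 2-4 can be sketched in a few lines. The records, field layout, and reason labels below are hypothetical stand-ins chosen for illustration; the actual coding of RNI and KEY!SEX should be taken from the codebook.

```python
from collections import Counter

# Hypothetical records: (respondent id, sex, reason for noninterview).
# A reason of None means the respondent was interviewed in the round.
records = [
    (1, "Male", None),
    (2, "Female", "Refused interview"),
    (3, "Male", "Not locatable"),
    (4, "Male", "Refused interview"),
    (5, "Female", None),
]


def tabulate_noninterview(rows):
    """Count noninterviews by (reason, sex), with a per-reason total."""
    counts = Counter()
    for _respondent_id, sex, reason in rows:
        if reason is None:
            continue  # interviewed respondents are not tabulated
        counts[(reason, sex)] += 1
        counts[(reason, "Total")] += 1
    return counts


table = tabulate_noninterview(records)
for (reason, group), n in sorted(table.items()):
    print(reason, group, n)
```

Swapping the sex field for sample type or race/ethnicity would give the layout of the other two tables.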

2.4 Table 2. Reason for Noninterview by Gender

                      Reason for noninterview
                      Deceased    Not         Technical   R too ill   R           Refused     Other       Total
                                  locatable   problem                 unavailable interview

Round 2 total         7           104         6           6           42          428         5           598
  Male                3           52          3           3           22          229         4           316
  Female              4           52          3           3           20          199         1           282

Round 3 total         16          193         2           1           51          510         3           776
  Male                7           108         2           1           34          275         3           430
  Female              9           85          --          --          17          235         --          346

Round 4 total         15          173         6           6           80          612         12          904
  Male                6           88          --          2           53          326         8           483
  Female              9           85          6           4           27          286         4           421

Round 5 total         25          279         --          1           77          718         2           1102
  Male                14          153         --          --          57          386         1           611
  Female              11          126         --          1           20          332         1           491

Round 6 total         30          253         --          4           4           765         32          1088
  Male                17          144         --          3           3           410         25          602
  Female              13          109         --          1           1           355         7           487

Round 7 total         37          256         --          2           26          858         51          1230
  Male                23          156         --          2           18          437         36          672
  Female              14          100         --          0           8           421         15          558

Round 8 total         45          277         --          6           208         898         48          1482
  Male                25          175         --          4           182         461         20          867
  Female              20          102         --          2           26          437         28          615

Round 9 total         59          418         --          5           92          999         73          1646
  Male                32          259         --          5           80          522         35          933
  Female              27          159         --          0           12          477         38          713

Note: Table based on RNI (R25102., R38277., etc.) and KEY!SEX (R05363.).

2.4 Table 3. Reason for Noninterview by Sample Type

                      Reason for noninterview
                      Deceased    Not         Technical   R too ill   R           Refused     Other       Total
                                  locatable   problem                 unavailable interview

Round 2 total         7           104         6           6           42          428         5           598
  Cross-sectional     6           63          3           6           37          350         4           469
  Supplemental        1           41          3           --          5           78          1           129

Round 3 total         16          193         2           1           51          510         3           776
  Cross-sectional     13          122         2           1           35          400         3           576
  Supplemental        3           71          --          --          16          110         --          200

Round 4 total         15          173         6           6           80          612         12          904
  Cross-sectional     12          107         5           5           61          496         8           694
  Supplemental        3           66          1           1           19          116         4           210

Round 5 total         25          279         --          1           77          718         2           1102
  Cross-sectional     19          172         --          1           53          583         2           830
  Supplemental        6           107         --          --          24          135         --          272

Round 6 total         30          253         --          4           4           765         32          1088
  Cross-sectional     23          162         --          4           1           637         23          850
  Supplemental        7           91          --          --          3           128         9           238

Round 7 total         37          256         --          2           26          858         51          1230
  Cross-sectional     27          162         --          2           23          715         37          966
  Supplemental        10          94          --          --          3           143         14          264

Round 8 total         45          277         --          6           208         898         48          1482
  Cross-sectional     35          192         --          6           124         746         45          1148
  Supplemental        10          85          --          --          152         152         3           334

Round 9 total         59          418         --          5           92          999         73          1646
  Cross-sectional     45          307         --          3           59          827         70          1311
  Supplemental        14          111         --          2           33          172         3           335

Note: Table based on RNI (R25102., R38277., etc.) and CV_SAMPLE_TYPE (R12358.).

2.4 Table 4. Reason for Noninterview by Race/Ethnicity

                      Reason for noninterview
                      Deceased    Not         Technical   R too ill   R           Refused     Other       Total
                                  locatable   problem                 unavailable interview

Round 2 total         7           104         6           6           42          428         5           598
  Non-black/non-Hisp. 2           22          2           3           22          278         3           332
  Black               4           39          --          1           8           79          --          131
  Hispanic or Latino  1           40          4           2           11          69          2           129
  Mixed               --          3           --          --          1           2           --          6

Round 3 total         16          193         2           1           51          510         3           776
  Non-black/non-Hisp. 8           65          1           1           23          297         1           396
  Black               6           59          --          --          13          123         1           202
  Hispanic or Latino  2           68          1           --          14          87          1           173
  Mixed               --          1           --          --          1           3           --          5

Round 4 total         15          173         6           6           80          612         12          904
  Non-black/non-Hisp. 6           61          1           5           33          375         4           485
  Black               8           44          3           1           21          128         6           211
  Hispanic or Latino  1           67          1           --          26          106         2           203
  Mixed               --          1           1           --          --          3           --          5

Round 5 total         25          279         --          1           77          718         2           1102
  Non-black/non-Hisp. 9           100         --          1           35          417         2           564
  Black               14          77          --          --          23          189         --          303
  Hispanic or Latino  2           97          --          --          19          109         --          227
  Mixed               --          5           --          --          --          3           --          8

Round 6 total         30          253         --          4           4           765         32          1088
  Non-black/non-Hisp. 11          89          --          2           1           490         16          609
  Black               16          68          --          --          2           152         10          248
  Hispanic or Latino  3           95          --          2           1           115         6           222
  Mixed               --          1           --          --          --          8           --          9

Round 7 total

Non-black/non-Hisp.

Black

Hispanic or Latino

Mixed

37

14

19

4

--

256

92

76

86

2

--

--

--

--

--

2

1

--

1

--

26

16

4

6

--