Household Composition

Organization of Rosters

To organize the information about household residents and nonresident relatives, two rosters were created during the administration of the questionnaire. The first, the household roster, includes information for all current residents of the respondent's permanent household. For easy identification, all variables on the round 1 household roster were assigned question names that begin with "HHI2_."  The second key roster is the nonresident roster, which presents information about the youth's nonresident relatives.  All question names of nonresident roster variables begin with "NONHHI_." Note that all household roster variables from subsequent rounds have names that begin with "HHI_" (the "2" is dropped).

Two additional rosters were then created using household and nonresident data provided by the household informant. These youth and parent rosters modify the household and nonresident rosters with the NLSY97 youth respondent as the focus. For example, an item on both new rosters identifies the line where each NLSY97 youth's biological mother is located. The HHI2 data are not reused because there may be more than one youth and more than one responding parent in a given household, each requiring their own roster (see Sample Design & Screening Process for details about multiple respondent households).

The youth roster includes information specific to the NLSY97-eligible youth, as well as some data collected regarding the youth's parents. Many of the roster items are later verified and corrected if necessary during the youth interview. For example, youths are asked if their age as reported by the household informant is correct, and they provide the correct age if the information is inaccurate. All items on the youth roster have question names beginning with "YOUTH_." Similarly, the parent roster is created using Screener, Household Roster, and Nonresident Roster Questionnaire data about the youth respondent and the responding parent; it is updated during the parent interview. The question names of parent roster items begin with "PARYOUTH_." The parent roster was created only if a parent interview was conducted.

For a pictorial representation of how the rosters described above are created and used, see Figure 2: Creation of Round 1 Rosters Based on Screener Data (PDF).

As stated above, much of the information contained in the rosters may appear in the data set more than once. As Figure 2 suggests, data will first be included at the point in the interview when the information was actually collected. For example, screener question SE-28 asked the household informant for the date of birth of each household member. After all the raw data had been gathered, the computer sorted all the answers and created the rosters described above. If there were errors in the original answers and the youth respondent or responding parent provided corrected information, the roster items were often changed to reflect the most up-to-date information. Additionally, because the data were sorted before the creation of the roster, the ID number listed for a person in the screener questions does not necessarily identify the same person as the ID number in the household roster and elsewhere during the interview. To associate screener information with household roster information, researchers must use variables R10978.-R10993., which provide a crosswalk between the two sets of ID numbers.  This process can be avoided by using the roster items rather than the raw interview data.

Structure of the household roster. The household roster contains the data described above for each household member and organizes it in a matrix form for use by researchers. A key variable in the household roster is the ID number of the household member (HHI2_ID.xx--the ".xx" indicates that this variable is repeated for each household member, beginning with HHI2_ID.01, HHI2_ID.02, and so on). This variable identifies the line number of the household member on the roster. For example, if the NLSY97-eligible youth is listed first on the roster (that is, has a value of 1 for the variable HHI2_ID.01), then all other HHI2 variables that refer to household member 1 contain information about the youth. If the youth's father is second on the roster, or has a value of 2 for the variable HHI2_ID.02, then his information is presented in the HHI2 variables referring to household member 2.

Users should be aware that the ID numbers, or line numbers, were assigned by the computer in a specific order. The household informant reported information about household members in no particular order. After the raw data were collected from the household informant, the computer first identified youths eligible for the NLSY97 and put them at the top of the list of residents. If there was more than one eligible youth, the youths were listed from oldest to youngest. No household has more than five youth respondents, so no youth respondent has a household ID number higher than 5. After listing the NLSY97-eligible youths, the computer sorted everyone else in the household from oldest to youngest. Therefore, if an older relative such as a grandparent lived in the household, he or she is listed next, followed in many cases by the youth's parents and then any siblings not eligible for the survey.
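The ordering rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the survey's actual program: the Member class, its field names, and the example household are hypothetical, and ages stand in for whatever birth date information the computer used.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str             # hypothetical label, for illustration only
    age: int              # stand-in for the reported age or birth date
    eligible_youth: bool  # True if the person is an NLSY97-eligible youth


def assign_line_numbers(members):
    """Return {line number: Member}.

    Eligible youths come first, then everyone else; within each group
    people are ordered from oldest to youngest.
    """
    ordered = sorted(members, key=lambda m: (not m.eligible_youth, -m.age))
    return {line: m for line, m in enumerate(ordered, start=1)}


# Hypothetical household, reported in no particular order.
household = [
    Member("father", 45, False),
    Member("younger youth", 13, True),
    Member("grandmother", 70, False),
    Member("older youth", 16, True),
]

roster = assign_line_numbers(household)
for line, member in roster.items():
    print(line, member.name)
```

In this made-up household the two eligible youths take lines 1 and 2, the grandmother line 3, and the father line 4, which matches the pattern described above.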

The relationship variables on the round 1 household roster provide information about the relationship of every household member to every other household member. For example, consider a household with three members: the respondent (ID number 1), his father (ID number 3), and his grandmother (ID number 2). The resulting relationship variables are depicted in Table 1. The decimal extension pertains to the line number, with the same line number equaling the same person. In later rounds, relationship variables indicate the relationship of each household member to the respondent but not relationships between household members.

Table 1. Example Structure of the Round 1 HHI2 Relationship Data

Line (ID) number 1 (the respondent)
  HHI2_REL1.01 (relationship of 1 to 1): identity [1]
  HHI2_REL1.02 (relationship of 1 to 2): grandson
  HHI2_REL1.03 (relationship of 1 to 3): son
Line (ID) number 2 (the grandmother)
  HHI2_REL2.01 (relationship of 2 to 1): paternal grandmother
  HHI2_REL2.02 (relationship of 2 to 2): identity [1]
  HHI2_REL2.03 (relationship of 2 to 3): mother
Line (ID) number 3 (the father)
  HHI2_REL3.01 (relationship of 3 to 1): father
  HHI2_REL3.02 (relationship of 3 to 2): son
  HHI2_REL3.03 (relationship of 3 to 3): identity [1]

[1] A code of "identity" for these variables indicates a relationship of "self."

By sorting through the relationship variables, researchers can identify all people in the household with a particular relationship to each other. For example, a user might want to count the number of the oldest woman's children in the household. After identifying which household member is the oldest woman, the researcher can look at the variables giving each other member's relationship to her and see which have a code of son or daughter. If the woman has an ID number of 3, the researcher would write a program that checks the variables for the relationship of member 1 to member 3 (HHI2_REL1.03), member 2 to member 3 (HHI2_REL2.03), and so on, following the naming shown in Table 1. Each member with a code of 49 (daughter) or 50 (son) is a biological child of the woman.
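A minimal Python sketch of such a program follows. It assumes that HHI2_REL<i>.<jj> holds what member i is to member j, which is the reading suggested by Table 1 (HHI2_REL2.01 is "paternal grandmother" for the grandmother on line 2); users should confirm the direction against the codebook before relying on it. The function name, the dictionary layout of a case record, and the limit of 16 roster lines are assumptions made for illustration.

```python
SON, DAUGHTER = 50, 49  # relationship codes cited in the text above


def children_of(record, parent_line, max_members=16):
    """Return line numbers of members coded as son or daughter of parent_line.

    Assumes HHI2_REL<i>.<jj> describes what member i is to member j
    (the reading suggested by Table 1). `record` is a hypothetical dict
    mapping question names to values for one household.
    """
    kids = []
    for i in range(1, max_members + 1):
        if i == parent_line:
            continue
        code = record.get(f"HHI2_REL{i}.{parent_line:02d}")
        if code in (SON, DAUGHTER):
            kids.append(i)
    return kids


# Hypothetical household: member 3 is the oldest woman; members 1 and 2
# are her son and daughter. Only the variables needed here are shown.
example = {"HHI2_REL1.03": SON, "HHI2_REL2.03": DAUGHTER}
print(children_of(example, parent_line=3))
```

If the data turn out to use the opposite direction, swapping the two indices in the variable name is the only change needed.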

The household roster also includes variables that identify specific types of relationships among household members. For each household member, these variables provide the ID number of that person's biological mother (HHI2_MOMID), biological father (HHI2_DADID), and spouse or partner (HHI2_SPOUSEID or HHI2_PARTNERID) if they also live in the household.

In addition to the set of variables indicating the relationship of each household member to every other person in the household, the round 1 household roster includes a set of variables called HHI2_RELY. These variables provide the relationship of each person in the household to the youth respondent, eliminating the need for the detailed programming described above if the youth respondent is the person of interest. For example, to determine how person 3 is related to the youth respondent, researchers can look at the variable HHI2_RELY.03. In the case presented in Table 1 above, this variable would have a value of 4, indicating that household member 3 is the youth respondent's father.
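As a rough sketch, scanning the HHI2_RELY variables for a given code might look like the Python below. The function name, the dictionary layout, the limit of 16 roster lines, and the example record are all hypothetical; only the code 4 for "father" comes from the text.

```python
FATHER = 4  # code cited in the text for "father"


def members_with_rely(record, code, max_members=16):
    """Return roster lines whose HHI2_RELY.xx value equals `code`."""
    found = []
    for line in range(1, max_members + 1):
        if record.get(f"HHI2_RELY.{line:02d}") == code:
            found.append(line)
    return found


# Hypothetical record in which the member on line 3 is coded as the
# youth's father; other HHI2_RELY values are omitted.
record = {"HHI2_RELY.03": FATHER}
print(members_with_rely(record, FATHER))
```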

Linking the rosters to other data from the same round. Most research requires linking variables from the household and nonresident rosters to other data collected during the parent and youth portions of the survey. This section describes how to identify the youth respondent, responding parent, and household informant, as well as the steps necessary to identify other key relatives of the youth and responding parent.

The youth respondent can be identified by using R05334. (YOUTH_ID.01). This variable provides the line number of the youth respondent on the round 1 household roster, which is denoted by .01. For example, if the value of R05334. is 1, the youth's ID number for the household roster is 1. All information about household member 1 on the roster pertains to the youth. If the value of R05334. is 2, then information about household member 2 pertains to the youth, and so on. As noted above, no NLSY97 youth respondent has an ID number higher than 5, so researchers who just want youth information will only need to examine data for the first five members on the household roster.
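The lookup just described can be sketched in Python as follows. The helper name, the dictionary layout of a case record, and the UID values are made up for illustration; YOUTH_ID.01, HHI2_ID.xx, and HHI2_UID.xx are the question names used in this section.

```python
def roster_items_for(record, line):
    """Collect HHI2 roster items whose name ends in the given line suffix."""
    suffix = f".{line:02d}"
    return {
        name: value
        for name, value in record.items()
        if name.startswith("HHI2_") and name.endswith(suffix)
    }


# Hypothetical case record: the youth sits on line 2 of the roster.
record = {
    "YOUTH_ID.01": 2,
    "HHI2_ID.01": 1, "HHI2_ID.02": 2,
    "HHI2_UID.01": 101, "HHI2_UID.02": 102,  # made-up UID values
}

youth_line = record["YOUTH_ID.01"]
youth_items = roster_items_for(record, youth_line)
print(youth_items)
```

Note that the pairwise relationship variables (for example HHI2_REL1.02) also end in a line suffix, so a real program may want to handle them separately.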

The parent roster (PARYOUTH) also contains a variable with the youth ID number on the household roster. However, researchers are advised to use the youth roster variable because it was created during the youth interview.

Identification of the responding parent requires a similar process. Researchers should use variable R07350. (PARYOUTH_PARENTID), which gives the ID number of the parent selected to be the responding parent at the end of the screener interview. Users should note that the youth roster also contains an ID variable called "ID of R 01 Resp Parent." However, because this variable is based solely on the screener and does not contain any updated information from the parent questionnaire, researchers are advised not to use this variable to identify the responding parent. (For information about identifying the responding parent's spouse or partner, refer to Parent Characteristics.)

Finally, the household informant is fairly easy to identify. Variable R05381. (INFORMANT!ID) provides the ID number of the informant. As with the parent and youth, this number is the position of the informant on the household roster.

Researchers also may want to identify key relatives of the NLSY97 youth, particularly the youth's parents, even if they were not respondents to any part of the survey. To facilitate this process, the round 1 data include identification variables that indicate the ID number of a given person on the household and nonresident rosters. Table 2 lists key youth roster ID variables available in the round 1 data set.

Table 2. Round 1 ID Variables for Key Relatives of NLSY97 Respondents

Reference number  Question name (all begin with YOUTH)  Description
R05318.           _ADOPDADID.01                         ID # of youth's resident adoptive father
R05319.           _ADOPMOMID.01                         ID # of youth's resident adoptive mother
R05323.           _DADID.01                             ID # of youth's resident biological father
R05327.           _FOSTDADID.01                         ID # of youth's resident foster father
R05328.           _FOSTMOMID.01                         ID # of youth's resident foster mother
R05336.           _MOMID.01                             ID # of youth's resident biological mother
R05339.           _NONR1ID.01                           ID # of youth's 1st non-responding resident parent
R05344.           _NONR2ID.01                           ID # of youth's 2nd non-responding resident parent
R05350.           _NRDADID.01                           ID # of youth's nonresident biological father
R05351.           _NRMOMID.01                           ID # of youth's nonresident biological mother
R05358.           _SPOPARID.01                          ID # of youth's resident spouse or partner
R05359.           _STEPDADID.01                         ID # of youth's resident stepfather
R05360.           _STEPMOMID.01                         ID # of youth's resident stepmother

For instance, the youth respondent answers questions about his or her biological mother. If researchers want to examine the characteristics of the biological mother contained in the household roster or nonresident roster (depending on her residence), they would first look at the MOMID variable. If the mother lives in the household, this variable will have a valid value. For example, a value of 5 means that all the roster variables for household member number 5 contain information about the mother. If there is no positive value for this variable, the next step is to look at the NRMOMID variable to obtain the position of the biological mother on the nonresident roster. As with the household roster, the value of this variable indicates the ID number of the biological mother on the nonresident roster.
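The two-step check described above can be sketched in Python as follows. The function name and the dictionary layout of a case record are hypothetical; the sketch treats any missing or non-positive value as "not on this roster," following the "no positive value" rule in the text.

```python
def locate_biological_mother(record):
    """Return (roster, ID number) for the youth's biological mother, or None.

    Checks YOUTH_MOMID.01 (household roster) first, then falls back to
    YOUTH_NRMOMID.01 (nonresident roster). Only positive values count as
    valid ID numbers.
    """
    mom = record.get("YOUTH_MOMID.01")
    if mom is not None and mom > 0:
        return ("household", mom)
    nr_mom = record.get("YOUTH_NRMOMID.01")
    if nr_mom is not None and nr_mom > 0:
        return ("nonresident", nr_mom)
    return None


# Hypothetical records; -4 stands in for any non-positive (missing) code.
resident = {"YOUTH_MOMID.01": 5, "YOUTH_NRMOMID.01": -4}
nonresident = {"YOUTH_MOMID.01": -4, "YOUTH_NRMOMID.01": 1}

print(locate_biological_mother(resident))
print(locate_biological_mother(nonresident))
```

The same pattern would apply to the father, using the DADID and NRDADID variables from Table 2.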

Similarly titled variables are contained in the parent roster (PARYOUTH). However, the youth variables (YOUTH) are more accurate because they were adjusted to reflect corrected relationship data. Therefore, use the youth roster variables to identify the youth's parents.

Linking individuals on the rosters across survey rounds. Many researchers want to examine changes in the youth's household over time. Each round includes the collection of information about members of the youth's household organized in a household roster. Individuals will not necessarily remain at the same place on the roster in different rounds. For example, a father who had a line ID number of "3" in round 1 might have a line ID number of "2" or "4" in round 2.  Therefore, each household member is assigned a second, separate identification code, called a unique ID (UID). This unique ID will remain constant across survey rounds, even if household members move to a different place on the roster, so that researchers can identify a given household member in more than one round. For household members in round 1, UIDs are contained in questions HHI2_UID.01-HHI2_UID.16 on the household roster.

Individuals may also move between the household roster and the nonresident roster. If such a move occurs, the person keeps the same UID number. Researchers can use the UID number to track individuals as they move in and out of the household. However, users should be aware that no updated information is collected about nonresidents other than biological, step-, or adoptive parents after round 1 (see Characteristics of Non-Residential Relatives for more information).
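A minimal Python sketch of tracking one person across rounds by UID is shown below. Each roster is represented as a hypothetical dictionary of line number to UID, and the UID values are made up; in the actual data these values would come from the UID question on each round's roster.

```python
def line_by_uid(roster):
    """Invert {line number: UID} into {UID: line number}."""
    return {uid: line for line, uid in roster.items()}


def track(uid, *rosters):
    """Return the person's line number in each roster (None if absent)."""
    return [line_by_uid(roster).get(uid) for roster in rosters]


# Hypothetical household rosters for two rounds, as {line number: UID}.
round1 = {1: 101, 2: 102, 3: 103}
round2 = {1: 101, 2: 103}  # the member with UID 102 has left the household

print(track(103, round1, round2))  # same person, different line numbers
print(track(102, round1, round2))  # absent from the round 2 roster
```

Matching on the UID rather than the line number is what keeps the father on line 3 in round 1 and line 2 in round 2 identified as the same person.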

Table 3. Examples of UID Codes

Unique ID Round first reported
102 Round 1 Household Roster
202 Round 1 Nonresident Roster
199801 Round 2
199901 Round 3
200001 Round 4
200101 Round 5
200201 Round 6
200301 Round 7
200401 Round 8
200501 Round 9
200601 Round 10
200701 Round 11
200801 Round 12
200901 Round 13
201001 Round 14
201101 Round 15
201301 Round 16
201501 Round 17
201701 Round 18
201901 Round 19
202101 Round 20
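The pattern in Table 3 can be turned into a small lookup, sketched below in Python. This rests on an assumption that goes beyond the examples shown: that every three-digit UID starting with 1 or 2 comes from the round 1 household or nonresident roster, and that every six-digit UID starts with the four-digit year listed in Table 3. The function name is hypothetical.

```python
# Year prefix -> round, transcribed from Table 3 (1998 = round 2 through
# 2011 = round 15, then the later rounds listed individually).
YEAR_TO_ROUND = {year: year - 1996 for year in range(1998, 2012)}
YEAR_TO_ROUND.update({2013: 16, 2015: 17, 2017: 18, 2019: 19, 2021: 20})


def first_reported(uid):
    """Guess where a UID was first reported, based on the Table 3 examples.

    Assumes the leading digits generalize beyond the examples in the table.
    Returns None when the UID does not fit either pattern.
    """
    if 100 <= uid < 200:
        return "Round 1 Household Roster"
    if 200 <= uid < 300:
        return "Round 1 Nonresident Roster"
    round_number = YEAR_TO_ROUND.get(uid // 100)
    return f"Round {round_number}" if round_number is not None else None


for uid in (102, 202, 199801, 201301):
    print(uid, first_reported(uid))
```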

Comparison to Other NLS Surveys: Information on the respondent's household is available for all cohorts for most survey years. Data generally include the age, gender, relationship to the respondent, and educational attainment of all occupants; the enrollment status of those of school age; and the occupation and weeks worked of residents age 14 and older. In the pre-1980 surveys of the Original Cohorts, data were generally collected only for family members living in the respondent's household and not for unrelated household members. For more complete information, refer to the appropriate cohort's User's Guide.

Survey Instruments: These questions are found in the screener and household roster sections of the round 1 Screener, Household Roster, and Nonresident Roster Questionnaire and section P5 of the round 1 Parent Questionnaire. In subsequent rounds, they are found in the household information section (see questions that begin with YHHI) of the Youth Questionnaire.

Related User's Guide Sections: Household & Neighborhood Environment
Main Area of Interest: Household Characteristics
Supplemental Areas of Interest: Screener Extended; Screener Household