Sample Weights

Appropriate sample weights are available in each year to adjust the un-weighted sample cases for the minority oversamples and year-to-year sample attrition. The sample weights for younger children and young adults:

  1. adjust the un-weighted data for sample attrition of mothers and their children since the first survey round (1979) and for the sample reduction due to the loss of the military and economically disadvantaged white oversamples, and
  2. adjust the sample for the over-representation of black and Hispanic youth. 

For those interested in generating population estimates for prior survey rounds, sample weights for those survey rounds are available. 

Using these weights translates the un-weighted sample of children into a population that represents all children who have been born by that date to a nationally representative sample of women who were 14 to 21 on December 31, 1978. Beginning in 2002, a revised algorithm was used to compute the sample weights. For the 1994-2000 survey years, two sample weight variables are available for each year: the originally released sample weight and a revised weight using the new algorithm.

Weights are computed only for younger children who have been interviewed or young adults who have been fielded and interviewed in a given year. Children not assessed and young adults not interviewed (or interviewed but not fielded) in a given year are assigned a weight of zero for that year. Table 1 lists the complete set of child, young adult, and mother sample weights.

Table 1. NLSY79 Mother, Child, and Young Adult Sampling Weights

Cohort        Question Name     Survey Years
Mother        SAMPWT79          1979
Child         CSAMWTyy          1986-1998
Child         CSAMWTyyyy        2000-current
Child         CSAMWTyyyy_REV    1986-2000
Young Adult   YAyyWEIGHT        1994-current

Revised Sampling Weights. Starting with the 2002 child survey round, an updated automated computation procedure was instituted to allow users to create custom sets of weights for analyses that require more than cross-sectional weighting information. The automated process was designed both to sum to the same population totals and to follow the same procedures as before. Because the results differ slightly, an additional set of revised cross-sectional weighting variables is provided for the Child survey years 1986-2000 (CSAMWTyyyy_REV) and for the Young Adult survey years 1994-2000 (YAyyWEIGHT_REVISED). Users should find minimal differences between the two series of sample weights but are strongly encouraged to check whether switching between the two types of weights affects their results.
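One way to run the check suggested above is to compute the same weighted estimate under each weight series and compare the results. The following is a minimal sketch only: the records, the "score" outcome, and the helper name weighted_mean are hypothetical, and only the variable-name patterns CSAMWTyyyy and CSAMWTyyyy_REV come from Table 1.

```python
def weighted_mean(records, value_key, weight_key):
    """Weighted mean of value_key; zero-weight cases contribute nothing."""
    total_weight = sum(r[weight_key] for r in records)
    if total_weight == 0:
        raise ValueError("no cases with a positive weight")
    return sum(r[value_key] * r[weight_key] for r in records) / total_weight


# Hypothetical extract: one row per child, with both 2000 weight series.
children = [
    {"score": 100.0, "CSAMWT2000": 2000, "CSAMWT2000_REV": 2100},
    {"score": 90.0, "CSAMWT2000": 3000, "CSAMWT2000_REV": 2900},
    {"score": 110.0, "CSAMWT2000": 0, "CSAMWT2000_REV": 0},  # not assessed
]

original = weighted_mean(children, "score", "CSAMWT2000")
revised = weighted_mean(children, "score", "CSAMWT2000_REV")
difference = revised - original
print(f"original={original:.2f} revised={revised:.2f} diff={difference:.2f}")
```

If the difference is large relative to the precision an analysis needs, that is a signal to look more closely at which cases changed weight.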

Important Information

Beginning in 2002, the NLSY79 Child and Young Adult sampling weights were constructed using an updated algorithm. This updated algorithm was also used to create revised weights for earlier survey rounds (identified by "REV" or "REVISED" in the question name).

The mother's sampling weight SAMPWT79 can be found in the Child-Young Adult data in the SAMPLING WEIGHTS Area of Interest.

The Child sampling weights have been assigned to two Areas of Interest in the documentation: (1) SAMPLING WEIGHTS and (2) ASSESSMENT for the relevant years.

The Young Adult weights have been assigned to the following two Areas of Interest in the documentation: (1) SAMPLING WEIGHTS and (2) YA COMMON KEY VARIABLES.

The child/young adult sample weights adjust for sample attrition of NLSY79 mothers and children (including the loss of the military and white oversamples) and for over-representation of black and Hispanic respondents. Each set of cross-sectional child sample weights is computed by multiplying the mother's 1979 sample weight by a factor that is the reciprocal of the rate at which children in particular age/sample-type/sex cells are assessed or interviewed. 
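The cross-sectional computation described above can be sketched as follows. This is an illustration under stated assumptions, not the production weighting code: the record layout, the cell labels, and the function name are hypothetical, and the response rate is taken here as a simple unweighted count ratio within each cell, which the text does not specify.

```python
from collections import defaultdict


def cross_sectional_weights(children):
    """Return {child id: weight} for one survey year.

    Each child is a dict with hypothetical keys:
      id, cell (age group, sample type, sex), mother_weight, assessed.
    """
    eligible = defaultdict(int)
    assessed = defaultdict(int)
    for child in children:
        eligible[child["cell"]] += 1
        if child["assessed"]:
            assessed[child["cell"]] += 1

    weights = {}
    for child in children:
        if not child["assessed"]:
            weights[child["id"]] = 0.0  # not assessed: weight of zero
            continue
        # Assumption: rate = assessed count / eligible count in the cell.
        rate = assessed[child["cell"]] / eligible[child["cell"]]
        # Mother's 1979 weight times the reciprocal of the cell's rate.
        weights[child["id"]] = child["mother_weight"] / rate
    return weights


sample = [
    {"id": 1, "cell": ("0-4", "cross-sectional", "F"), "mother_weight": 1000.0, "assessed": True},
    {"id": 2, "cell": ("0-4", "cross-sectional", "F"), "mother_weight": 1200.0, "assessed": False},
    {"id": 3, "cell": ("5-9", "cross-sectional", "M"), "mother_weight": 800.0, "assessed": True},
    {"id": 4, "cell": ("5-9", "cross-sectional", "M"), "mother_weight": 900.0, "assessed": True},
]
weights = cross_sectional_weights(sample)
```

In this toy input, half of the first cell was assessed, so the assessed child's weight is doubled; the second cell was fully assessed, so those children keep their mothers' weights.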

The current public release contains a complete set of custom child weights for all child survey years in which values are assigned according to the following criteria:

  • Each non-interviewed child's weight = 0.
  • Each interviewed child's weight is equal to the mother's weight multiplied by the number of children her interviewed child represents.
  • Every interviewed child represents himself or herself, plus the number of non-interviewed known children, plus the number of children estimated to have been born to non-interviewed mothers. This last set of imputed children is derived by taking the number of years since the mother was last interviewed and assigning the same number and ages of children born as were reported for interviewed mothers of the same sex and race.
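The three criteria above translate almost directly into a small function. This is a sketch only: the function and parameter names are hypothetical, and how the non-interviewed and imputed children are apportioned among interviewed children is not described here, so the counts are simply passed in.

```python
def custom_child_weight(mother_weight, interviewed,
                        known_not_interviewed=0.0, imputed=0.0):
    """Custom weight for one child under the criteria listed above.

    known_not_interviewed and imputed are the numbers of other children
    this child stands in for (hypothetical inputs, possibly fractional).
    """
    if not interviewed:
        return 0.0  # non-interviewed children get a weight of zero
    represented = 1 + known_not_interviewed + imputed  # self plus others
    return mother_weight * represented


example_weight = custom_child_weight(1000.0, True,
                                     known_not_interviewed=1, imputed=0.5)
```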

In the other NLS cohorts, the cell collapsing code is relatively complex and allows the program to merge almost any set of adjacent cells. In creating the weights for the Child-Young Adult cohort the cell collapsing code is simpler. Generally cells are collapsed as follows:

  • Only the end points are collapsed (oldest and youngest kids)
  • The end point is the same for males and females (to follow how it was done prior to the custom weighting program)
  • Cells are collapsed if there are fewer than 10 observed children
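The collapsing rules above can be illustrated with a short sketch. It is not the actual weighting program: the function name and input layout are hypothetical, and it assumes age cells are ordered youngest to oldest and that an end cell is widened until both sexes reach the minimum count.

```python
MIN_CELL_SIZE = 10  # threshold stated in the rules above


def collapse_end_points(male_counts, female_counts, min_size=MIN_CELL_SIZE):
    """Group age-cell indexes, merging only at the youngest and oldest ends.

    male_counts / female_counts: observed children per age cell,
    ordered youngest first. The same grouping is used for both sexes.
    """
    n = len(male_counts)
    if n != len(female_counts):
        raise ValueError("both sexes need the same number of age cells")

    def too_small(cells):
        return (sum(male_counts[i] for i in cells) < min_size
                or sum(female_counts[i] for i in cells) < min_size)

    # Widen the youngest group until it is large enough for both sexes.
    low = 1
    while low < n and too_small(range(0, low)):
        low += 1
    if low >= n:
        return [list(range(n))]  # everything merged into a single cell

    # Widen the oldest group the same way, without crossing the youngest.
    # (If the two groups meet, the oldest group may still be small.)
    high = n - 1
    while high > low and too_small(range(high, n)):
        high -= 1

    middle = [[i] for i in range(low, high)]  # interior cells stay as-is
    return [list(range(0, low))] + middle + [list(range(high, n))]


groups = collapse_end_points([3, 8, 15, 20, 12, 4], [5, 6, 14, 18, 11, 7])
```

In this toy input the two youngest cells are merged and the two oldest cells are merged, while the interior cells are left alone.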

Customized Longitudinal Weights. Researchers who need to weight individuals who participated in multiple survey rounds (for example, all children who participated in 1988-2008) are referred to the custom weighting program. Caution should be used when comparing weighted estimates across years, since the composition of the sample can change in subtle ways depending on who was interviewed. The custom weighting program can also produce weights for a specific set of respondent IDs.

Sample Weights to Identify Interviews. Users can also employ the Child sample weight variables to delineate their analysis sample and to identify respondents interviewed in each survey round. Restricting the sample to cases with a sample weight greater than zero yields the set of respondents interviewed and/or assessed in a particular survey year. Counting the cases with a child sample weight (CSAMWTyyyy_REV or CSAMWTyyyy) greater than zero gives the number of children with a Mother Supplement and/or a Child Supplement.
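As a minimal sketch of this restriction, the snippet below keeps only the cases with a positive weight for one survey year. The records, the child_id key, and the function name are hypothetical; CSAMWT2000 follows the CSAMWTyyyy naming pattern in Table 1.

```python
def interviewed_ids(records, weight_var):
    """Return the IDs of cases whose weight for the given year is above zero."""
    return [r["child_id"] for r in records if r.get(weight_var, 0) > 0]


# Hypothetical extract with the 2000 child weight.
records = [
    {"child_id": 201, "CSAMWT2000": 1850},
    {"child_id": 202, "CSAMWT2000": 0},     # not interviewed or assessed
    {"child_id": 301, "CSAMWT2000": 2420},
]

analysis_sample = interviewed_ids(records, "CSAMWT2000")
n_children = len(analysis_sample)
```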

Please note that in 2000, four Young Adult respondents who were part of the pool of oversample cases that were not fielded were inadvertently interviewed. For these four respondents, their interview data are included in the public release, but their sampling weights are set to zero. Similarly, since 2010, in each survey round some Young Adults over thirty who were not fielded ended up being interviewed. For these respondents, their interview data are included in the public release, but their sampling weights are set to zero. More detail on sample weights and interview status can be found in the section Missing Data: Noninterviews and Item Nonresponse.

Where to Find the Sample Weights. The list of sample weight variables for children and young adults appears in Table 1 above.

The Child sample weights are assigned to both the SAMPLING WEIGHTS and the yearly ASSESSMENT areas of interest. Children who have been assessed or interviewed in a given year have values greater than 0 on their sample weight for that year.

The Young Adult sample weights for each year are assigned to both the SAMPLING WEIGHTS and the YA COMMON KEY VARIABLES areas of interest. These YA sample weight variables are specific to young adults interviewed in that year, so any young adult not interviewed, or any child who is not a young adult in that year, is assigned a value of 0.

A Note about Sampling Weights

The NLSY79 Child and Young Adult 1994-2016 data set includes revisions to several Young Adult sampling weights as well as the replacement of two revised Child sampling weights.

Young Adult Sampling Weights. Beginning in 2010, young adults over age 30 are only interviewed every four years. The interviewed sample is selected by age as of December 31 of the survey year, so that approximately half of the older young adults are eligible each round. Since 2010, young adults age 31-32, 35-36, 39-40, 43-44, etc. as of December 31 of the target year have not been fielded.
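The age pattern described above follows a four-year cycle and can be expressed compactly. This is a sketch of the stated age rule only: the function names are hypothetical, and other reasons a respondent might not be fielded are ignored.

```python
def age_on_dec_31(birth_year, survey_year):
    """Age as of December 31 of the survey year."""
    return survey_year - birth_year


def skipped_by_age_rule(age):
    """True for ages 31-32, 35-36, 39-40, 43-44, ... (not fielded since 2010).

    Ages 30 and under are not affected by this rule; ages 33-34, 37-38,
    41-42, ... are the over-30 groups that are fielded.
    """
    if age <= 30:
        return False
    return (age - 31) % 4 in (0, 1)


# A hypothetical respondent born in 1977, followed across rounds:
status = {year: skipped_by_age_rule(age_on_dec_31(1977, year))
          for year in (2010, 2012, 2014)}
```

The hypothetical respondent is 33 in 2010 (fielded), 35 in 2012 (skipped), and 37 in 2014 (fielded), which matches the every-four-years schedule for older young adults.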

The algorithm creating the round-specific sampling weights did not adequately account for this change in fielding, leading older YAs in the age groups that were fielded (33-34, 37-38, 41-42, etc.) to receive disproportionately high weights. The algorithm has been readjusted and the round-specific sampling weights for the interviewed YAs over age 30 have been replaced in the following variables:

Y26159.00    [YA10WEIGHT]
Y29663.00    [YA12WEIGHT]
Y33318.00    [YA14WEIGHT]

Child Sampling Weights. An error in the code that created the revised round-specific sampling weights released in 2002 caused the 1986 and 1988 Child sampling weights to be calculated incorrectly. The following weights have been replaced for all affected children:

C05812.01    [CSAMWT1986_REV]
C08007.01    [CSAMWT1988_REV]