Chapter 3 continued:  Guide to the Mature Women Data

3.2  Types of Variables

Four types of variables are present in the Mature Women data files. The type of a variable affects both its title (the variable description that names it) and its physical placement within the codebook. The types are:

  1. Direct raw responses from a questionnaire or other survey instrument.

  2. Edited variables constructed from raw data according to consistent and detailed sets of procedures (e.g., occupational codings, *KEY* variables, etc.).

  3. Constructed variables based on responses to more than one data item either cross-sectionally or longitudinally and edited for consistency where necessary.

  4. Variables provided by the Census Bureau or another outside organization based on sources not directly available to the user (e.g., characteristics of respondents’ geographical areas).

User Notes: In general, CHRR does not impute missing values or perform internal consistency checks across waves. Data quality checks most often occur in the process of constructing cumulative and current status variables such as ‘Highest Grade Completed.’

Reference Numbers

Every variable within the main NLS data set has been assigned an identifying number that determines its relative position within the data file and documentation system. Persons contacting NLS User Services should be prepared to discuss their question or problem with reference to the reference number(s) of the variable(s) in question.

Reference numbers, once assigned, remain constant through subsequent revisions of the files. Reference numbers are assigned sequentially, with variables from the first survey year having a lower reference number than those variables specific to the second year, and so forth. Occasionally, variables are created sometime after the year in which the data were actually collected. These variables are frequently given a reference number that reflects the year in which the actual data were gathered rather than the year the created variable was constructed.

Table 3.2.1 lists reference numbers for each survey year since 1967 for the Mature Women.

Table 3.2.1 Mature Women Reference Numbers by Survey Year

Survey Year   Reference Numbers        Survey Year   Reference Numbers
1967          R00001.-R00813.          1982          R05270.-R06640.
1968          R00833.01-R00868.        1984          R06650.-R07192.
1969          R00868.01-R01333.01      1986          R07203.-R07809.
1971          R01334.-R02047.          1987          R07820.-R08863.
1972          R02048.-R02876.          1989          R08865.-R10077.
1974          R02878.-R03067.          1992          R10080.-R13027.
1976          R03080.-R03273.50        1995          R16007.-R34923.
1977          R03280.-R04540.          1997          R34950.-R42503.
1979          R04542.-R04892.          1999          R42527.-R54394.
1981          R04900.-R05267.          2001          R54400.-R63276.
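
As a rough illustration of how the ranges in Table 3.2.1 might be used, the following Python sketch looks up the survey year whose range contains a given reference number. The names RANGES, ref_key, and survey_year are hypothetical helpers written for this example and are not part of any NLS software. The sketch assumes reference numbers always take the form R, five digits, a period, and an optional two-digit suffix, as they do in the table. Because created variables are sometimes numbered by the year their data were gathered, the result indicates only where a variable sits in the file.

```python
import re

# Ranges transcribed from Table 3.2.1: (survey year, first ref, last ref).
RANGES = [
    (1967, "R00001.", "R00813."),
    (1968, "R00833.01", "R00868."),
    (1969, "R00868.01", "R01333.01"),
    (1971, "R01334.", "R02047."),
    (1972, "R02048.", "R02876."),
    (1974, "R02878.", "R03067."),
    (1976, "R03080.", "R03273.50"),
    (1977, "R03280.", "R04540."),
    (1979, "R04542.", "R04892."),
    (1981, "R04900.", "R05267."),
    (1982, "R05270.", "R06640."),
    (1984, "R06650.", "R07192."),
    (1986, "R07203.", "R07809."),
    (1987, "R07820.", "R08863."),
    (1989, "R08865.", "R10077."),
    (1992, "R10080.", "R13027."),
    (1995, "R16007.", "R34923."),
    (1997, "R34950.", "R42503."),
    (1999, "R42527.", "R54394."),
    (2001, "R54400.", "R63276."),
]

# Assumed format: "R", five digits, a period, optional two-digit suffix.
_REF = re.compile(r"^R(\d{5})\.(\d{2})?$")


def ref_key(ref):
    """Turn a reference number such as 'R00833.01' into a sortable tuple (833, 1)."""
    match = _REF.match(ref.strip())
    if match is None:
        raise ValueError(f"not a reference number: {ref!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def survey_year(ref):
    """Return the survey year whose range contains ref, or None if it falls in a gap."""
    key = ref_key(ref)
    for year, first, last in RANGES:
        if ref_key(first) <= key <= ref_key(last):
            return year
    return None


print(survey_year("R00868."))    # 1968
print(survey_year("R00868.01"))  # 1969
print(survey_year("R00820."))    # None (between the 1967 and 1968 ranges)
```

Comparing tuples of integers rather than floating-point values keeps suffixes such as .01 and .50 exact.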

Variable Titles

Every variable within NLS main file data sets has been assigned an 80-character summary title that serves as the verbal representation of that variable throughout the hard copy and electronic documentation system. Variable titles are assigned by CHRR archivists, who endeavor, within the limitations described below, to capture the core content of each variable and to incorporate within the title (1) common words that facilitate easy identification of comparable variables; (2) UNIVERSE IDENTIFIERS that specify the subset of respondents for which each variable is relevant; and (3) for some variables, REFERENCE PERIODS that indicate the period of time (e.g., survey year or calendar year) to which these data refer. Universe identifiers and reference periods are discussed below.

Universe Identifiers: If two ostensibly identical variables differ only in that they refer to different universes, the variable title will include a reference to the applicable universe.

Example 1:       ‘Reason for Being OLF, 77 (Not Empld, Have Worked)’

                        ‘Reason for Being OLF, 77 (Not Empld, Not Worked)’

Reference Periods: Variable descriptions may include a phrase indicating the time period to which these data refer. The following general conventions apply:

Survey Year: When the variable title includes either the phrase XX INT (e.g., 82 INT) or a year (e.g., 67) that is not preceded by the preposition “in,” the year identifies the survey in which the variable was measured, not necessarily the year to which the data apply.

Example 2:        ‘Move to Current Residence - Prior SMSA, 82 INT’ refers to a residential move occurring in the period before the 1982 interview.

Example 3:        ‘Number of Weeks Worked in Past Year, 67’ refers to the weeks worked in the 12-month period preceding the 1967 survey.

Calendar Year: When a date follows a verbal description of a variable and is part of the prepositional phrase “in XX,” the date identifies the calendar year for which the relevant information was collected. The title in Example 4 refers to payments received in calendar year 1988 with data collected during the 1989 survey.

Example 4:        ‘Income from Social Security Payments Based on R’s Work Record in 88, 89’
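
The survey year and calendar year conventions above can be approximated in code. The Python sketch below is a heuristic written for this guide, not an NLS tool, and the function name reference_period is hypothetical. It treats a two-digit number preceded by “in” as a calendar year and the last other two-digit number in the title as the survey year. Titles containing unrelated two-digit numbers (ages, counts of months) would be misread, so any result should be checked against the questionnaire.

```python
import re

# A two-digit number, optionally preceded by the word "in".
_YEAR = re.compile(r"\b(in\s+)?(\d{2})\b", re.IGNORECASE)


def reference_period(title):
    """Guess the survey year and calendar year named in a variable title.

    Returns a dict of two-digit integers (or None where nothing was found).
    """
    survey_year = None
    calendar_year = None
    for match in _YEAR.finditer(title):
        if match.group(1):
            # "in XX": the calendar year the data refer to.
            calendar_year = int(match.group(2))
        else:
            # "XX INT" or a bare "XX": the survey year of measurement.
            survey_year = int(match.group(2))
    return {"survey_year": survey_year, "calendar_year": calendar_year}


print(reference_period("Move to Current Residence - Prior SMSA, 82 INT"))
# {'survey_year': 82, 'calendar_year': None}
print(reference_period("Number of Weeks Worked in Past Year, 67"))
# {'survey_year': 67, 'calendar_year': None}
print(reference_period("Income from Social Security Payments in 88, 89"))
# {'survey_year': 89, 'calendar_year': 88}
```

The third call uses a shortened title that follows the pattern of Example 4. The sketch leaves years as two-digit values because a calendar year can precede the first survey year, which makes any fixed century cutoff an additional assumption.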

User Notes: All searches for NLS variables are essentially searches for variable descriptions or titles. Electronic searches of NLS variables via the NLS data file accessing methods ultimately produce listings of variables by their reference number and variable description or title.

Flexibility in variable title assignment for raw data items is restricted by (1) the actual wording of the question as it appears within the survey instrument; (2) precedent, i.e., how that type of variable has been titled in previous survey years; and (3) the maximum allowable length for variable titles. An attempt is also made to include key phrases in variable titles so that large groups of variables with similar or related subject matter can be easily identified.

Users should be careful not to assume that two variables with the same or similar titles necessarily have the same (1) universe of respondents, (2) coding categories, or (3) time reference period. Although the universe identifier and reference period conventions discussed above have been applied, users are urged to consult the questionnaires for skip patterns and exact time periods for a given variable and to factor in the relevant fielding period(s).

Variables with similar content (e.g., information on respondents’ labor force status) may have completely different titles, depending on the type of variable (raw versus created).

Example 1: ‘Employment Status Recode’ (ESR) is the created or reconstructed version of the ‘Activity Most of Survey Week’ raw variable. The ‘Activity’ variable is derived from the first item of the full series of questions used by the Department of Labor (DOL) to obtain employment status; the title reflects questionnaire content. ESR, on the other hand, reflects the procedure used to recode the ‘Activity’ variable. This produces a constructed variable for all NLS respondents based upon responses to the ‘Activity’ question and all other questions used by the DOL to obtain employment status. These other questions serve to qualify and refine employment status beyond the answer to the initial ‘Activity’ question. (Note that ESR has been replaced by a similar variable, MLR, beginning in 1995; see the “Labor Force Status” section of this guide for details.)

Finally, different archivists have assigned variable descriptions to data from the NLS cohorts over a period of three decades. While every effort has been made to maintain consistency, users may find some differences in variable titles. Two primary sources of variation exist in Original Cohort variable title assignment. The first is systematic: questions with identical wording across the four Original Cohorts may have slightly different variable titles. Before 1995, the rule was to give title consistency within a cohort the highest priority. Starting in 1995, joint fielding forced the archivists to choose one title and cross-reference the other cohort’s title in the archivist notes. The second is random: spacing or punctuation errors. The sorting process that produces variable title listings usually places such variables near, if not next to, the series of interest.
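
Users who work with a listing of titles outside the NLS search tools may want to catch spacing and punctuation variants themselves. The Python sketch below is one possible approach, under the assumption that titles are held in a dictionary keyed by an identifier. The function names and the keys "ref_a", "ref_b", and "ref_c" are placeholders invented for this example, not actual reference numbers. It cannot detect systematic differences in wording between cohorts.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase a title, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(text.split())


def group_variants(titles_by_ref):
    """Group identifiers whose titles differ only in case, spacing, or punctuation."""
    groups = defaultdict(list)
    for ref, title in titles_by_ref.items():
        groups[normalize_title(title)].append(ref)
    # Keep only normalized titles shared by more than one identifier.
    return {key: refs for key, refs in groups.items() if len(refs) > 1}


# Placeholder identifiers; the titles follow the pattern of the universe identifier example.
titles = {
    "ref_a": "Reason for Being OLF, 77 (Not Empld, Have Worked)",
    "ref_b": "Reason for Being OLF,77  (Not Empld,Have Worked)",
    "ref_c": "Reason for Being OLF, 77 (Not Empld, Not Worked)",
}
print(group_variants(titles))
# {'reason for being olf 77 not empld have worked': ['ref_a', 'ref_b']}
```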

Identifying Mode of Interview

User Notes: There are important differences between the content of telephone and personal interviews. In the late 1960s and early 1970s, most of the interviews were conducted in person, usually at the respondent’s home. There was one attempt at a mail survey for the Older Men and Mature Women cohorts in 1968; however, the low response rate led to dropping that type of contact. After the first five years, the decision was made to conduct a major survey every five years and two telephone surveys during the five-year span so that problems of recall could be avoided and contact could be maintained with the respondents.

There are several ways of identifying whether a survey was a personal or telephone interview. Users can (1) refer to Table 2.4.1 in the “Interview Schedule & Fielding Periods” section of Chapter 2, which shows the type of interview by survey year, or (2) examine variable titles assigned to questions of similar content. Differences in what appear to be comparable variables reflect variations in the wording of the question or the fact that the reference period for an identically worded question may differ between a personal and a telephone interview. Questions that refer to the last five years were usually found in a personal (or five-year) interview. As a result, some questions were asked only in the five-year surveys and some only in the telephone surveys. Users conducting longitudinal analysis need to adjust their variable creation procedures to account for the differences in data collection between the early years of uninterrupted personal interviews and subsequent survey years when telephone interviews were used.

Starting with the 1989 survey, the collection pattern was altered again; a decision was made to conduct a personal interview every other year and collect data going back to the date of the last interview. However, the scheduled 1991 survey was delayed until 1992 due to the demands of the 1990 decennial census and the decision to interview the Young Women first, in 1991. The scheduled 1994 survey was then delayed until 1995 so that the women’s cohorts could be interviewed at the same time using the same CAPI/CATI instrument. Biennial surveys have been conducted in 1997, 1999, and 2001.

When analyzing data, users should remember that not all surveys were conducted during the same season of each survey year. Responses to labor force status questions, for example, may differ significantly depending on whether fielding occurred during the summer or the winter months. See the discussion of fielding periods in Chapter 2 of this guide for more information.

