Race, Ethnicity & Immigration Data

Important Information About Using Race, Ethnicity and Immigration Data

  • Race and ethnicity variables for household members are based on information collected on the Household Screener, on which race and one ethnic background were recorded for each household member.
  • The interviewer's identification of the respondent's race can be subjective. Each interview from 1979-1986 and 1988-1998 collected information on the interviewer's direct observation of the race of the respondent ("black," "nonblack/non-Hispanic," or "other"). 
  • No special instructions are provided within the Question by Question Specifications as to how the interviewer is to code race.

Additional instructions for coding race, ethnic origin, and the racial/ethnic identifier variable can be found in the Household Screener and Interviewer's Reference Manual (1978) and in a NORC memo dated 10/4/78 available from NLS User Services.

The following race and ethnicity variables are available for NLSY79 respondents: 

  1. a racial/ethnic variable based on the sample identification code assigned by NORC
  2. a series of self-reported ethnic origin variables collected during the 1979 and 2002 surveys
  3. a set of interviewer identifications of the race of the respondent at the time of the interview
  4. racial/ethnic identification for current and past spouse/partners
  5. variables representing the respondent's immigration history and status collected during the 1990 survey
  6. a 1979 variable indicating whether a foreign language was spoken in the house during the respondent's childhood
  7. a series of variables recorded by the interviewer indicating whether the survey was administered in English or another language
  8. race/ethnicity variables for each family member listed during the 1978 screener
  9. race of the interviewer where available at each interview
  10. country of origin of the respondent's parents and the respondent's country of birth, available on the restricted Geocode release

Race and ethnic origin information is also available for each household member identified during the 1978 household screening. In 2002, respondents were asked to identify their race/ethnicity using questions that conformed to Federal government definitions. Of related interest is a series of immigration questions, fielded in 1990, that collected information on country of citizenship at the time that foreign-born respondents entered the U.S.


The variable 'Racial/Ethnic Cohort from Screener' (R02147.) designates the respondent as "Hispanic," "black," or "nonblack/non-Hispanic" and provides the basis for weighting NLSY79 data. This variable is collapsed from R01736., 'Sample Identification Code,' which includes such values as "supplemental male black" or "cross-sectional female Hispanic." This code was assigned by NORC to each respondent based on information gathered during the 1978 household screening. In the creation of the 'Sample Identification Code' and thus the 'Racial/Ethnic Cohort' variable, both race and ethnic origin information collected at the time of the 1978 household screening were used. Interviewers conducting the screening were instructed to:

  1. code race by observation into three categories, "nonblack/non-Hispanic," "black," or "other"
  2. inquire about the ethnicity of all household members age 14 or above
  3. assign ethnicity, without asking, to household members under age 14

Coding procedures used by NORC to assign the "Hispanic," "black," and "nonblack/non-Hispanic" identifications to respondents included the following classification guidelines:

"Hispanics" were those who self-identified as Hispanic, that is, those whose ethnicity screener code was 1-4:
  1. Mexican American, Chicano, Mexican, Mexicano
  2. Cuban, Cubano
  3. Puerto Rican, Puertorriqueno, Boricua
  4. Latino, Other Latin American, Hispano, or Spanish descent

Persons who did not self-identify as Hispanic but who met the following conditions were also classified as "Hispanic":
  • those who identified themselves in the ethnic origin categories that included Filipino (code 6) or Portuguese (code 13)
  • those whose householder or householder's spouse reported speaking Spanish at home as a child
  • those whose family surname is listed on the Census list of Spanish surnames

"Blacks" included those for whom race was coded "black" and ethnic origin was "non-Hispanic," or those whose ethnic origin was coded black, Negro, or Afro-American (code 5) regardless of race coding.

"Nonblack/non-Hispanics" included those whose race was coded "white" or "other" and who did not identify themselves as either black or Hispanic in answer to the ethnicity question. Instructions to interviewers for coding race included coding in the "other" category those persons who were Japanese, Chinese, Vietnamese, Asian Indian, Native American, Korean, Eskimo, Pacific Islander, or of another race besides black or white.

Father's race was to be used to assign race for those of mixed descent, except for some cases of those under age 14 of Spanish descent. Users should note that this decision rule is different from that applied to the NLSY79 children, for whom the mother's race is used. Spanish origins were to be given preference: if at least one ethnicity mentioned was of Spanish origin, the Spanish origin was to be coded (or, for those under 14, if at least one parent was Hispanic, the Hispanic parent's ethnicity was assigned).
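The classification guidelines above can be read as a small decision procedure. The Python sketch below is illustrative only and is not NORC's actual coding program. The record layout, the function name, and the `require_all` switch are hypothetical. In particular, the text does not say whether the three conditions for persons who did not self-identify as Hispanic had to hold jointly or separately, so the sketch exposes both readings and makes no claim about which one NORC applied. Race is assumed to arrive as "white," "black," or "other."

```python
from dataclasses import dataclass

# Ethnicity screener codes named in the text above.
HISPANIC_CODES = {1, 2, 3, 4}   # self-identified Hispanic origins
BLACK_CODE = 5                  # black, Negro, or Afro-American
FILIPINO_CODE = 6
PORTUGUESE_CODE = 13


@dataclass
class ScreenerRecord:
    """Hypothetical layout for one household member's screener data."""
    race: str                                   # "white", "black", or "other" (assumed labels)
    ethnicity_code: int                         # ethnic origin code from the screener
    spanish_spoken_in_childhood_home: bool = False
    spanish_surname: bool = False               # surname on the Census Spanish-surname list


def assign_cohort(rec: ScreenerRecord, require_all: bool = True) -> str:
    """Return "Hispanic", "black", or "nonblack/non-Hispanic".

    require_all is an assumption of this sketch: True treats the three
    conditions for non-self-identified persons as jointly required,
    False treats any one of them as sufficient.
    """
    # Self-identified Hispanic origin takes precedence.
    if rec.ethnicity_code in HISPANIC_CODES:
        return "Hispanic"

    conditions = [
        rec.ethnicity_code in (FILIPINO_CODE, PORTUGUESE_CODE),
        rec.spanish_spoken_in_childhood_home,
        rec.spanish_surname,
    ]
    if all(conditions) if require_all else any(conditions):
        return "Hispanic"

    # Race coded black (ethnic origin already known to be non-Hispanic here),
    # or ethnic origin code 5 regardless of race coding.
    if rec.race == "black" or rec.ethnicity_code == BLACK_CODE:
        return "black"

    # Everyone else: race "white" or "other", neither black nor Hispanic.
    return "nonblack/non-Hispanic"


# Code 9 below is an arbitrary placeholder for "some other origin".
print(assign_cohort(ScreenerRecord("white", 1)))
print(assign_cohort(ScreenerRecord("white", 5)))
print(assign_cohort(ScreenerRecord("other", 9)))
```

The sketch omits the rules for mixed descent and for household members under age 14, which depend on parental information not modeled here.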

A series of ethnic identification variables, '1st-6th Racial/Ethnic Origin' and 'Racial/Ethnic Origin with Which R Identifies Most Closely' (R00096.-R00102.), provide extensive ethnicity information. Respondents were asked during the 1979 interviews to name the racial/ethnic origins with which they identified. A listing of more than 20 categories, including "Black," "English," "French," "German," "American Indian," "Irish," "Mexican," "Mexican-American," and "Puerto Rican," was presented on a Show Card. A respondent who offered more than one origin was also asked for the ethnic group with which he or she most closely identified. Users should be aware that frequency counts for the coding category "Indian American, or Native American" are unusually high. About 5 percent of respondents reported this racial/ethnic origin, compared to Census estimates of approximately 0.5 percent of the population. This may have resulted from some respondents' misinterpretation of the term "Native American." Table 1 compares frequencies of the 1979 first (or most closely held) ethnic identification with the NORC-assigned racial/ethnic identification.

Table 1. Ethnicity by Racial or Ethnic Cohort from Screener (Unweighted Data)

                                          NORC-Assigned Race/Ethnicity
Respondent's Self-Identification               Nonblack/           Hispanic
Racial/Ethnic Group [1]              Total  Non-Hispanic   Black  or Latino
Total                                12686          7510    3174       2002
Black                                 3049            19    3017         13
Total Hispanic or Latino              1834            46       5       1783
  Cuban                                116             1       0        115
  Chicano                               59             0       0         59
  Mexican                              383             5       0        378
  Mexican-American                     734            15       1        718
  Puerto Rican                         328             7       1        320
  Other Hispanic or Latino             118             7       0        111
  Other Spanish                         96            11       3         82
Total European                        5281          5100      82         99
  French                               311           290      10         11
  German                              1395          1376       5         14
  Greek                                 31            29       0          2
  English                             1561          1476      51         34
  Irish                                949           933       3         13
  Italian                              497           474       7         16
  Polish                               238           234       3          1
  Portuguese                            97            88       3          6
  Russian                               45            45       0          0
  Scottish                             122           120       0          2
  Welsh                                 35            35       0          0
Total Asian                            117            93      11         13
  Asian Indian                          22            20       2          0
  Chinese                               26            22       4          0
  Filipino                              43            33       4          6
  Japanese                              19            14       0          5
  Korean                                 6             3       1          2
  Vietnamese                             1             1       0          0
Hawaiian/Pacific Islander               20            17       0          3
American Indian                        622           585      17         20
Other                                  779           736      21         22
American                               743           692      10         41
None [2]                               241           222      11          8

[1] R00102., 'Racial/Ethnic Origin with Which R Identifies Most Closely,' is used unless it was not answered, in which case R00096., '1st or Only Ethnic Origin,' is used. Those listing only one ethnic background did not answer R00102.
[2] Includes totals of 98 "don't know," 132 "none," 10 "invalid skips," and 1 "refusal."
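The fallback rule in the first table note and the "about 5 percent" figure for American Indian identification can both be illustrated with a short Python sketch. This is a hedged illustration rather than the procedure used to build Table 1: the function name is hypothetical, and `None` stands in for "not answered" purely for convenience, which may differ from how missing values are coded in the actual data files. The counts are copied from Table 1.

```python
from typing import Optional


def closest_origin(r00102: Optional[str], r00096: Optional[str]) -> Optional[str]:
    """Pick the identification used for a Table 1 row.

    Use R00102. (origin identified with most closely) when it was answered;
    otherwise fall back to R00096. (1st or only ethnic origin).
    None is this sketch's placeholder for an unanswered item.
    """
    return r00102 if r00102 is not None else r00096


# A respondent listing one origin skipped R00102., so R00096. is used.
print(closest_origin(None, "German"))       # German
# A respondent listing several origins answered R00102., which wins.
print(closest_origin("Irish", "German"))    # Irish

# Unweighted counts taken from Table 1.
TOTAL_RESPONDENTS = 12686
AMERICAN_INDIAN_TOTAL = 622
AMERICAN_INDIAN_BY_COHORT = (585, 17, 20)   # nonblack/non-Hispanic, black, Hispanic

# The three cohort cells add up to the row total.
assert sum(AMERICAN_INDIAN_BY_COHORT) == AMERICAN_INDIAN_TOTAL

share = AMERICAN_INDIAN_TOTAL / TOTAL_RESPONDENTS
print(f"{share:.1%}")                       # 4.9%
```

The resulting unweighted share of roughly 4.9 percent is the figure the text rounds to "about 5 percent."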


In 1990, NLSY79 respondents born outside the United States, its territories, or Puerto Rico were asked a series of questions on their immigration history and visa status. Dates of first and most recent entrance into the United States to live for six or more months and information on whether the respondent was the principal entrant/immigrant were collected. For respondents' or principal entrant/immigrants' first and most recent entry or change in visa/immigration status, details were gathered on:

  1. visa or immigration status at entry date
  2. form of temporary entry visa
  3. citizenship status (that is, citizen or permanent resident alien) and relationship of the sponsoring relative
  4. country of citizenship at entry date or date of change of status

Also recorded for the respondent was information on:

  1. current citizenship/residence/visa status in the United States
  2. residence inside/outside the United States
  3. expectations to return to the United States to live permanently or to return to his or her country of birth to live permanently
  4. the total number of years spent outside the United States since initial entry

Of related interest is the variable 'Is R a Citizen of the U.S.,' available from the 1984 and 1990 interviews.

Foreign Language Used or Spoken

For each household member, information is available from the screener on presence of a Spanish surname and whether Spanish was the language spoken in the home when that individual was a child. The 1979 interview asked whether a foreign language (Spanish, French, German, other) was spoken at home during the respondent's childhood. In addition, interviewers record for each survey whether English, Spanish, or another foreign language was used to administer the Household Interview Forms ('English or Foreign Language Used for Household Record') and the questionnaire ('Int Remarks - Was Interview Conducted in English or Foreign Language').

Comparison to Other NLS Cohorts: Race is available for all cohorts; ethnicity is available for all cohorts except the Older Men and Young Men. Users should be aware that coding categories for race and ethnicity have varied among cohorts and over time. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.


NORC. 1978 Household Screener and Interviewer's Reference Manual. Chicago, IL: National Opinion Research Center - University of Chicago, 1978.

Survey Instruments and Documentation
  • Race and ethnicity variables originating from the screener are located on the second page of the Household Screener. Questions concerning the ethnicity of the respondent are included in the "Family Background" section (Section 1) of the 1979 questionnaire. Interviewer remarks regarding race are located in the final section ("Interviewer's Remarks") of each questionnaire. Immigration questions are located in Section 13, "Immigration," of the 1990 questionnaire.
  • For further information on the coding of race and ethnicity in the Household Screener, see the 1978 Household Screener and Interviewer's Reference Manual (NORC 1978). Those needing additional information on coding procedures should request a copy of a NORC memo dated 10/4/78 available from NLS User Services.
Areas of Interest
  • 'Birthplace (Country and State) of R's Mother/Father' and 'Birthplace (Country) of Father's Father' are available in the "Geocode 1979" area of interest (on the Geocode CD).
  • Race and ethnicity variables are included in the following areas of interest: 'Racial/Ethnic Cohort from Screener' is a "Common" variable. Ethnicity variables originating from the 1979 interview as well as all immigration variables have been placed in the "Family Background" area of interest. The interviewer's remarks variables are located in "Interviewer Remarks." Race variables for household members originating from the 1978 household screening are located in "Misc. 1979."  'Current Residence In "Misc. xxxx."