Speech Data in the NLSY97

In round 15, speech data were collected to learn about the relationship between a worker's speech and his/her labor market success, elaborating on the pilot study carried out by Grogger (2011). There were two main steps involved: collecting audio data and converting the audio data into numerical data suitable for regression analysis.

Important Information

The speech variables are best located using the question name (QNAME) search in NLS Investigator. Search for "Question Name starts with SPCH" to find this set of variables.

Audio data collection

Audio data were collected during round 15 of the NLSY97. The data were collected in response to two speech prompts, designed to capture both informal and formal speech. One prompt was administered at the end of the interview, when respondents were asked to recount the happiest moment (HM) in their life since the date of their last interview. The second question, administered during the employment section of the interview, involved a job-search (JS) role-playing exercise where respondents were asked the following:

Let's suppose you applied for a job that sounded really interesting to you and they called you and asked you to come in for an interview. How would you describe your skills, qualifications, and experience to me if I were the person interviewing you for this job? (Employed respondents heard a slightly different preamble to the question.)

All respondents who completed in-person interviews and who gave consent to be recorded were eligible to be assigned at least one speech prompt. Answers were recorded by the on-board microphone in each field interviewer's (FI's) laptop. To make the recording, the CAPI interview software was programmed to turn on the FI's laptop microphone for one minute once a prompt was reached. FIs were provided with instructions designed to keep the respondent talking for as much of that minute as possible.

Because of similarities between African American Vernacular English (AAVE) and Southern American English (SoAE), both stimulus questions were assigned to all African-American and Southern white respondents. Southern white respondents are defined as non-Hispanic whites who resided in the South Census region at age 12. A random sample of 500 respondents who were neither black nor Southern white was also to be assigned both speech prompts, as were roughly 295 other respondents for whom speech data were collected in 2006 as part of Grogger (2011) but who were not included in the categories above. All remaining respondents, whether non-Southern white or in the "other" group, were randomly assigned to only one of the speech prompts.
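The sampling plan described above can be sketched as a small Python function. This is an illustration only, not the CAPI assignment logic: the function name, the argument names, and the race coding are hypothetical, and the equal-probability draw between HM and JS is an assumption (the text says only that the assignment was random).

```python
import random


def assign_prompts(race, south_at_12, in_random_500=False,
                   in_2006_pilot=False, rng=random):
    """Return the prompts planned for an eligible (in-person, consenting) respondent.

    race          -- hypothetical coding: "black", "non_hispanic_white", or "other"
    south_at_12   -- True if the respondent lived in the South Census region at age 12
    in_random_500 -- True if drawn into the random sample of 500 (hypothetical flag)
    in_2006_pilot -- True if recorded in 2006 for Grogger (2011) (hypothetical flag)
    """
    is_black = race == "black"
    is_southern_white = race == "non_hispanic_white" and south_at_12
    if is_black or is_southern_white or in_random_500 or in_2006_pilot:
        return ("HM", "JS")
    # Everyone else gets exactly one prompt; a 50/50 draw is assumed here.
    return (rng.choice(["HM", "JS"]),)


print(assign_prompts("black", south_at_12=False))
print(assign_prompts("non_hispanic_white", south_at_12=True))
print(assign_prompts("other", south_at_12=True, rng=random.Random(0)))
```

The first two calls return both prompts; the third returns a single prompt chosen by the random draw.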

Table 1 provides data on round-15 speech-prompt sampling and response rates, disaggregated by race/region at age 12. Of the 8,984 original NLSY97 respondents, 7,423 were interviewed during round 15. Among those interviews, 6,579 were carried out in person. Among those, 6,080 respondents provided consent to be recorded and were thus eligible for this coding exercise. The share of round-15 respondents participating in in-person interviews and consenting to be recorded was .83 for blacks, .80 for both Southern whites and non-Southern whites, and .84 for the remaining group.
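The eligibility shares quoted above follow directly from the Table 1 counts. A minimal Python sketch of the arithmetic is given here; the dictionary and variable names are illustrative and do not correspond to NLSY97 variable names.

```python
# Counts copied from Table 1, by race/region at age 12.
r15_respondents = {"black": 2036, "southern_white": 931,
                   "non_southern_white": 2588, "other": 1868}
in_person_and_consent = {"black": 1698, "southern_white": 741,
                         "non_southern_white": 2079, "other": 1562}

# Share of round-15 respondents who were interviewed in person and
# consented to be recorded (that is, were eligible for a speech prompt).
eligible_share = {group: in_person_and_consent[group] / r15_respondents[group]
                  for group in r15_respondents}

for group, share in eligible_share.items():
    print(f"{group}: {share:.2f}")
# black: 0.83
# southern_white: 0.80
# non_southern_white: 0.80
# other: 0.84
```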

The center panel of Table 1 shows how eligible respondents were assigned to speech prompts. For the most part, the assignments followed the sampling plan fairly closely. All but seven of the black respondents, and all but two of the Southern white respondents, were assigned both questions. Among non-Southern whites and others, 795 respondents were assigned to both stimulus questions. Ten otherwise eligible respondents were not assigned either speech question.
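The counts cited in this paragraph can be checked against the center panel of Table 1. The Python sketch below does so; the names are illustrative only.

```python
# Center panel of Table 1 (speech prompt assignment), by race/region at age 12.
eligible = {"black": 1698, "southern_white": 741,
            "non_southern_white": 2079, "other": 1562}
assigned = {
    "both":    {"black": 1691, "southern_white": 739, "non_southern_white": 257, "other": 538},
    "hm_only": {"black": 1,    "southern_white": 0,   "non_southern_white": 906, "other": 516},
    "js_only": {"black": 6,    "southern_white": 2,   "non_southern_white": 913, "other": 501},
    "none":    {"black": 0,    "southern_white": 0,   "non_southern_white": 3,   "other": 7},
}

# Eligible respondents in each group who were not assigned both prompts.
not_both = {g: eligible[g] - assigned["both"][g] for g in eligible}
print(not_both["black"], not_both["southern_white"])  # 7 2

# Non-Southern whites and others assigned both prompts.
both_rest = assigned["both"]["non_southern_white"] + assigned["both"]["other"]
print(both_rest)  # 795

# Eligible respondents with no assignment at all.
print(sum(assigned["none"].values()))  # 10

# Each column of the panel should add up to the eligible count.
for g in eligible:
    assert sum(panel[g] for panel in assigned.values()) == eligible[g]
```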

The bottom panel of Table 1 provides counts of eligible respondents for whom audio files were actually generated by the interviews. There is a discrepancy between the number of respondents from whom audio data should have been collected and the number from whom it was actually collected. Of the 6,080 eligible respondents, audio files were obtained from only 4,907. The rate of loss among eligibles was 17 percent for blacks and Southern whites, 21 percent for non-Southern whites, and 20 percent for others. The panel also shows that there were black and Southern white respondents for whom only one audio file was obtained, when there should have been two.
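The loss rates above are one minus the ratio of respondents with at least one audio file to eligible respondents. A short Python sketch of that calculation from the Table 1 counts follows; the names are illustrative only.

```python
# Counts copied from Table 1, by race/region at age 12.
eligible = {"black": 1698, "southern_white": 741,
            "non_southern_white": 2079, "other": 1562}
at_least_one_audio_file = {"black": 1402, "southern_white": 616,
                           "non_southern_white": 1638, "other": 1251}

# Percentage of eligible respondents for whom no audio file was obtained.
loss_pct = {g: 100 * (1 - at_least_one_audio_file[g] / eligible[g])
            for g in eligible}

for group, pct in loss_pct.items():
    print(f"{group}: {pct:.0f} percent")
# black: 17 percent
# southern_white: 17 percent
# non_southern_white: 21 percent
# other: 20 percent

# Total number of eligible respondents with no audio data.
total_lost = sum(eligible.values()) - sum(at_least_one_audio_file.values())
print(total_lost)  # 1173
```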

The reasons for this loss of data are unclear. NLSY project staff indicate that audio files appear not to have been captured for the 1,173 (=6,080-4,907) respondents who were eligible to be recorded but for whom no audio data are available, perhaps due to technical difficulties in the CAPI interviewing system. The loss of recordings is widely distributed among FIs, rather than being concentrated among a few, so it appears to have been unintentional.

Producing numerical data from the audio files

To generate data suitable for the regression analysis, anonymous listeners were recruited to listen to the audio files and answer questions about the speakers. After listening to each audio file, listeners were asked to specify the speaker's sex, race/ethnicity, and region of origin. Three listeners were assigned to each audio file. Thus speakers who responded to both the HM and JS prompts have six listener reports, whereas speakers who responded to only one of the prompts have three. To deal with data security issues surrounding the use of potentially identifiable voice data, listeners were recruited from the pool of NORC FIs and research assistants. Data processing was carried out remotely using specially configured laptops that provided secure connections to NORC's computer network, where the audio files resided. All listeners received confidentiality training stipulated by both NORC and BLS.
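Because three listeners were assigned to each audio file, the number of listener reports per speaker depends only on how many of that speaker's prompts were processed. A minimal Python sketch of this bookkeeping is shown here; the function name and the prompt labels are illustrative assumptions, not part of the NLSY97 data files.

```python
LISTENERS_PER_FILE = 3  # stated in the text: three listeners per audio file


def expected_listener_reports(prompts_processed):
    """Number of listener reports for a speaker, given the prompts processed.

    prompts_processed -- iterable of prompt labels, e.g. ["HM"], ["JS"],
                         or ["HM", "JS"] (labels are illustrative).
    """
    return LISTENERS_PER_FILE * len(set(prompts_processed))


print(expected_listener_reports(["HM", "JS"]))  # 6
print(expected_listener_reports(["HM"]))        # 3
```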

Summary characteristics of the listeners are reported in Table 2. The modal listener was white and female, reflecting the demographics of the available workforce. Listeners were drawn from throughout the US, with disproportionately many Midwesterners. All listeners had completed high school; most had at least some tertiary education. The 22 listeners who listened to the JS audio files tended to be older, more Southern, and less educated than the 43 listeners who listened to the HM audio files (10 listened to both). Care was taken to ensure that speakers were not assigned to listeners who had interviewed them during round 15.

The HM files were processed first. All speakers with an HM audio file were in scope for HM data processing unless the file was empty or unintelligible. The top part of Table 3 shows that about 94 percent of the HM audio files were in scope, with the fraction ranging from 89 percent for black speakers to 99 percent for non-Southern whites.

Budgetary issues limited the scope of processing for the JS files. The goals for JS file processing were to maximize the number of blacks and Southern whites for whom both HM and JS data were available and to maximize the number of non-Southern whites for whom data from at least one of the speech prompts would be available, while meeting the project budget constraint. A handful of "other" speakers were processed as well. As with the HM data, JS files that were empty or inaudible were deemed out of scope. The middle part of Table 3 shows that 83 percent of the available JS files for black speakers were processed, compared to 92 percent of those for Southern whites and 79 percent of those for non-Southern whites. Speech data from at least one prompt are available for a total of 4,225 NLSY respondents.

Table 1. Round-15 response counts by respondent's race and region at age 12

Race/region	Black	Southern white	Non-Southern white	Other	Total
Original 1997 sample	2,335	1,160	3,253	2,236	8,984
R15 respondents	2,036	931	2,588	1,868	7,423
In-person interviews	1,833	797	2,269	1,680	6,579
...and consent to record	1,698	741	2,079	1,562	6,080
Speech prompt assignment:
Both questions	1,691	739	257	538	3,225
HM only	1	0	906	516	1,423
JS only	6	2	913	501	1,422
No assignment	0	0	3	7	10

At least one audio file	1,402	616	1,638	1,251	4,907
Both questions	1,283	570	194	419	2,466
HM only	22	6	706	400	1,134
JS only	97	40	738	432	1,307

Notes: HM = happiest moment; JS = job search.

Table 2. Percentage distribution of listener characteristics, by speech prompt

Listener Characteristics	Happiest Moment (HM) Prompt	Job Search (JS) Prompt
	(1)	(2)
SEX
Male	27	16
Female	73	84
Total	100	100

RACE/ETHNICITY
White	83	84
Black	13	15
Hispanic	2	1
Other	2	0
Total	100	100

REGION OF RESIDENCE
Northeast	21	19
Midwest	37	35
South	21	37
West	21	10
Unknown	0	0
Total	100	100

LEVEL OF EDUCATION
HS diploma or GED	5	24
HS and some college	38	33
Bachelor's degree or higher	57	43
Total	100	100

Mean age of listener (years)	48	54
