Appendix 6: Event History Creation and Documentation

National Longitudinal Survey of Youth - 1997 Cohort

Schooling Event History Arrays

There are three sets of schooling event history arrays: monthly grade school histories, yearly grade school histories, and monthly college event histories. Together, these three sets provide researchers with a complete overview of a respondent's education. Grade school histories, which cover kindergarten through 12th grade, are available only from rounds 2 to 12. They were discontinued after round 12 because by then the youngest NLSY97 respondents were in their mid-twenties and almost none were in school or providing information about their grade school activities. College event histories start in round 2 and are ongoing at present.

The grade school education arrays are somewhat different from the other event history arrays. Information on a respondent's education is reported in both yearly and monthly variables. This approach is used to combine information from the youth questionnaire, which collects more detailed data, and from the round 1 parent questionnaire, which presented information only for each year. Users should be aware that, because questions were not identical in the round 1 parent questionnaire and the round 2 youth questionnaire, the transition between the two data sources was not seamless and some information for the yearly variables had to be imputed. Researchers who feel that a given value is questionable may wish to compare the created yearly variables to the raw data and to the monthly schooling arrays described below.

Yearly Grade Schooling Variables

A set of grade school variables provides information for each year beginning in 1980, the year when the first information is available in the survey, through round 12. In general, these variables refer to the school year rather than the calendar year. That is, 1991 in a variable title or in the data for a variable generally indicates the school year starting in fall 1991 and ending in spring 1992.

1. SCH_YEAR_to_GRADE
This array presents the grade the respondent attended during the school year. The last four digits of the question name indicate the school year. For example, SCH_YEAR_to_GRADE.1990 refers to the grade attended by the respondent during the school year that starts in fall 1990 and ends in spring 1991.

2. SCH_GRADE_to_YEAR
This array refers to the year the respondent attended a certain grade. For example, if the respondent attended second grade in 1992-93, then SCH_GRADE_to_YEAR.2 would have the value 1992.

3. SCH_CHANGES
This array counts the number of times the respondent changed the school attended during the school year. For example, SCH_CHANGES.1990 shows how many different schools the respondent attended during the school year that started in fall 1990 and ended in spring 1991.

4. SCH_MNTHS_MISSED
This array presents the number of months during the school year that the respondent did not attend school. For example, if SCH_MNTHS_MISSED.1990 has a value of 3 for a respondent, then that respondent missed three months of school during the school year that started in the fall of 1990 and ended in the spring of 1991. A gap is defined as missing school for one or more months (not including summer vacation); the months missed do not have to be consecutive.

5. SCH_SUMMER_SCHOOL
This array refers to extra school classes during an educational break in a given school year, such as summer school. For example, SCH_SUMMER_SCHOOL.1990 shows whether the respondent attended school during a break in the 1990-91 school year.

6. SCH_SUSPENSIONS
This array counts the number of days during the school year the respondent was suspended from school. For example, if SCH_SUSPENSIONS.1990 has a value of 3 then the respondent was suspended from school 3 days during the school year that started in fall 1990 and ended in spring 1991.

7. SCH_GRADE_PROGRESS
This array has positive values if any special events occurred during the school grade. For example, a positive value in SCH_GRADE_PROGRESS.2 indicates that the respondent was skipped or demoted during second grade. Researchers should note that parents might have been confused as to how to answer the skip grade questions asked during the interview. For example, some parents say their child skipped from 5th to 6th grade, while others say from 4th to 6th grade. Both of these cases are probably stating that the child missed most or all of the 5th grade. To resolve this ambiguity, the event history program assumes that if a parent reports a skip between consecutive grades (e.g., 5th to 6th), the first grade (5th) was missed. If a parent reports non-consecutive grades (e.g., 4th to 6th), the program assumes that the grade(s) in between are the ones not attended.

8. SCH_YEAR_PROGRESS
This array refers to any special events that occurred during the school year. The question name's last four digits indicate the school year this variable refers to. For example, SCH_YEAR_PROGRESS.1990 shows special events that occurred during the school year that starts in fall 1990 and ends in spring 1991. The special events, such as grades skipped or demoted to, are defined in the same way as in the previous array.
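
The skip-grade rule described for SCH_GRADE_PROGRESS can be illustrated with a short sketch. This is not the survey's event history program; the function name grades_missed and its two arguments (the "from" and "to" grades a parent reported) are hypothetical and chosen only for illustration.

```python
from typing import List


def grades_missed(from_grade: int, to_grade: int) -> List[int]:
    """Return the grade(s) assumed not attended for a reported skip.

    Hypothetical helper: from_grade and to_grade stand for the two grades
    named in a parent's answer to the skip-grade question.
    """
    if to_grade <= from_grade:
        raise ValueError("a skip must move to a higher grade")
    if to_grade - from_grade == 1:
        # Consecutive grades reported (e.g., 5th to 6th):
        # the first grade named is treated as the one missed.
        return [from_grade]
    # Non-consecutive grades reported (e.g., 4th to 6th):
    # the grades in between are treated as the ones missed.
    return list(range(from_grade + 1, to_grade))


print(grades_missed(5, 6))  # [5]
print(grades_missed(4, 6))  # [5]
```

Both example reports resolve to the same missed grade, which matches the interpretation given above that each parent is probably describing a child who missed 5th grade.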

User Notes

As discussed in the Educational Status and Attainment section, there are a number of apparent inconsistencies in the raw survey data with respect to grade progression. Through a data quality review after round 6, survey staff determined that the complexity of the survey questions, coupled with problems in the way the data were interpreted during the programming of the event history arrays, led to a significant number of spurious repeated and skipped grades. For example, because of errors in reporting or programming, it may appear that a respondent completed 10th grade twice and then jumped ahead to 12th grade when in fact the respondent had a normal progression through the grades. The following paragraphs detail the six main problems found in the data and the steps taken to correct them.

  1. Survey staff reviewed the grade reported in the initial 1997 survey and the date of high school graduation. While the detailed school enrollment loops ask for information that individuals may not always report correctly, the date of graduation from high school is a salient event that respondents should report with a high degree of accuracy. Using this information, survey staff identified all respondents who moved from the grade reported in 1997 to high school graduation in the expected amount of time. If a respondent's graduation date indicates a normal school progression (one grade completed per school year), the event history program flagged the respondent and imposed a normal progression on the event history variables.
  2. A number of respondents enroll in college courses while they are still in high school. Event history arrays only contain a single grade attended for a given time period, and the original event history program was written so that college courses were given precedence over high school. For example, if an 11th-grader also took a freshman-level college class during first semester, the program assigned a grade of "13" (first year in college) for that semester. If the student then finished 11th grade but did not take any college classes during second semester, it would appear in the data that the student jumped ahead to year 13 of schooling and then back to 11th grade during the course of a single year. This resulted in a number of extra promotions and regressions. Consequently, the event history program has been rewritten to prioritize high school over college, removing these spurious grade changes.
  3. Some respondents provided a high school graduation date but then reported additional secondary school enrollment after that date. Survey staff decided to exclude post-graduation secondary school enrollment from the event histories, although this information is preserved in the raw data for researchers who might be interested in the additional training received by respondents after graduation.
  4. While answering the schooling questions, some respondents reported initial enrollment at a school but apparently did not understand that they should report each grade attended at that school in a separate loop within the schooling section. This resulted in some respondents appearing to remain in one grade for a long period of time, particularly if they had missed one or more interviews, and then apparently jumping ahead several grades. If, for example, a respondent appeared to be in 9th grade for 3 years and then jump ahead to 12th grade, the most likely reason is that he or she did not understand the schooling questions and actually did progress normally through 10th and 11th grade. The event history program now flags these respondents and adjusts their schooling history to follow a normal grade progression.
  5. In a number of cases, respondents appear to jump backward and then forward across multiple grades. For example, some respondents were listed as attending 9th grade, then 1st grade, then 11th grade. The most likely explanation for this pattern is a data entry error where the interviewer accidentally dropped the zero from 10th grade. Jumps in a normal school progression which appear to be caused by a missing digit in a two-digit grade were corrected.
  6. Finally, data review of individual cases indicates that, when asked what grade they had first attended at a given school, some respondents reported instead the first grade offered at that school. As with the problem in the previous paragraph, this causes respondents to appear to jump backwards across a number of grades and then jump forward again the next year. Hand edits were made to adjust the event histories for these respondents to a normal grade progression.
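
The dropped-digit correction in item 5 can be sketched as follows. This is only an illustration of the idea, not the procedure survey staff actually used; the function name repair_dropped_digit and the representation of a history as a list of one grade per school year are assumptions made for the example.

```python
from typing import List


def repair_dropped_digit(grades: List[int]) -> List[int]:
    """Restore a two-digit grade that appears to have lost a digit.

    Assumes one grade per school year, in order. A value is changed only
    when its neighbours are exactly two grades apart and the reported
    value is one of the digits of the grade expected between them.
    """
    fixed = list(grades)
    for i in range(1, len(fixed) - 1):
        prev_g, cur, next_g = fixed[i - 1], fixed[i], fixed[i + 1]
        expected = prev_g + 1
        if next_g - prev_g == 2 and cur != expected and expected >= 10:
            # Dropping one digit from the expected two-digit grade must
            # leave exactly the value that was recorded.
            if str(cur) in (str(expected)[0], str(expected)[1]):
                fixed[i] = expected
    return fixed


print(repair_dropped_digit([9, 1, 11]))  # [9, 10, 11]
print(repair_dropped_digit([9, 3, 11]))  # [9, 3, 11] (not a dropped digit)
```

The check is deliberately conservative: a sequence is left unchanged unless the surrounding years already imply a normal progression and the recorded value is consistent with a missing digit.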

The six changes described above significantly reduced the number of abnormal grade progressions found in the event history SCH_GRADE_PROGRESS variables. About three-quarters of the promotions and demotions found in the raw survey data for rounds 1-6 appear to be the result of reporting or programming errors. After the corrections were implemented, about 100 demotions and 570 promotions remained. Although it is possible that errors remain, survey staff feel, based on inspection of the data, that the vast majority of these grade changes reflect actual atypical progressions. Additional information about younger respondents' schooling continues to be collected, and staff will continue to review the data to determine whether newer information indicates that any of the remaining promotions or demotions are artifacts of inaccurate reporting.

Monthly Grade Schooling Variables

Starting in round 2, three types of monthly arrays are created. Each array captures information for each month from the respondent's interview date in round 1 to the round 12 interview date.

1. SCH_STATUS
This array reports the respondent's enrollment status during each month from the round 1 interview date through the current interview date. Coding categories include unknown, not enrolled, in grades K to 12, on vacation, expelled, and other.

2. SCH_TERM
These variables report the respondent's school type and grade for each month in the time period. The first two digits represent the type of school (public = 10, private = 20, religious = 30 and unknown = 40). The last two digits provide the respondent's grade in school (1-12).

3. SCH_ID
This variable permits users to link array information to the school roster in the main data file and access other information about the school. The variable uses the same ID codes as the identification variable on the school roster in the main data set (for example, NEWSCHOOL_PUBID.01).
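
As a minimal sketch of how the SCH_TERM coding could be unpacked, the example below assumes that each monthly value is stored as a single integer whose leading two digits are the school type and whose last two digits are the grade (for instance, 1009). That storage layout, the function name decode_sch_term, and the handling of non-positive values are assumptions; users should confirm the actual values in the codebook.

```python
from typing import Optional, Tuple

# School type codes as listed in the SCH_TERM description.
SCHOOL_TYPES = {10: "public", 20: "private", 30: "religious", 40: "unknown"}


def decode_sch_term(value: int) -> Optional[Tuple[str, int]]:
    """Split an SCH_TERM value into (school type, grade).

    Assumption: non-positive values carry no type/grade information,
    so they are returned as None rather than decoded.
    """
    if value <= 0:
        return None
    type_code, grade = divmod(value, 100)
    return SCHOOL_TYPES.get(type_code, "unrecognized"), grade


print(decode_sch_term(1009))  # ('public', 9)
print(decode_sch_term(3012))  # ('religious', 12)
```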

Monthly College Schooling Variables

Starting in round 2, four types of monthly arrays are created. Each array captures information for each month from the respondent's interview date in round 2 to the present.

1. SCH_COLLEGE_STATUS
This array reports the respondent's enrollment status during each month from the round 2 interview date through the current interview date. Coding categories include unknown, not enrolled, in a two year college, in a four year college and in graduate school.

2. SCH_COLLEGE_TERM
These variables report the respondent's school type and grade for each month in the time period. The first two digits represent the type of school (public = 10, private = 20 and unknown = 40). The last two digits provide the respondent's term in college (1-98; 99 means no term information provided).

3. SCH_COLLEGE_ID
This variable permits users to link array information to the school roster in the main data file and access other information about the school. The variable uses the same ID codes as the identification variable on the school roster in the main data set (for example, NEWSCHOOL_PUBID.01).

4. SCH_COLLEGE_DEGREE
This variable shows the type of degree the respondent is trying to obtain. The first two digits indicate whether the respondent is attending college full-time (code = 1), part-time (code = 2), or with unknown status (code = 3). The last two digits provide the type of degree (1 = Associates; 3 = BA or BS; 4 = MA, MBA, MS; 5 = Ph.D.; 6 = MD, JD; 10 = Joint BA/MA; 40 = Unknown).
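
The two composite college codes, SCH_COLLEGE_TERM and SCH_COLLEGE_DEGREE, could be unpacked along the following lines. The sketch assumes each value is a single integer in which the last two digits hold the term or degree code and the remaining leading digits hold the school type or enrollment code (for instance, 2099 or 103). The function names and that storage layout are assumptions for illustration only.

```python
from typing import Optional, Tuple

# Codes as listed in the variable descriptions above.
COLLEGE_TYPES = {10: "public", 20: "private", 40: "unknown"}
ENROLLMENT = {1: "full-time", 2: "part-time", 3: "unknown"}
DEGREES = {
    1: "Associates", 3: "BA or BS", 4: "MA, MBA, MS", 5: "Ph.D.",
    6: "MD, JD", 10: "Joint BA/MA", 40: "Unknown",
}


def decode_college_term(value: int) -> Tuple[str, Optional[int]]:
    """Split SCH_COLLEGE_TERM into (school type, term); term 99 becomes None."""
    type_code, term = divmod(value, 100)
    # 99 means no term information was provided.
    return COLLEGE_TYPES.get(type_code, "unrecognized"), (None if term == 99 else term)


def decode_college_degree(value: int) -> Tuple[str, str]:
    """Split SCH_COLLEGE_DEGREE into (enrollment status, degree type)."""
    enroll_code, degree_code = divmod(value, 100)
    return (ENROLLMENT.get(enroll_code, "unrecognized"),
            DEGREES.get(degree_code, "unrecognized"))


print(decode_college_term(1003))    # ('public', 3)
print(decode_college_term(2099))    # ('private', None)
print(decode_college_degree(103))   # ('full-time', 'BA or BS')
print(decode_college_degree(210))   # ('part-time', 'Joint BA/MA')
```

Codes that do not appear in the lists above are labeled "unrecognized" rather than guessed, so unexpected values remain visible when the arrays are tabulated.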