Coding Major Fields of Study.

The National Center for Education Statistics (NCES) conducts surveys which require the coding of the respondent's major field of study. This paper presents a new system for the coding of major field of study. It operates on-line in a CATI environment and allows conversational checks to verify coding directly from the respondent. The system "learns" by maintaining a database of response/coding pairs which can be incorporated into its algorithm after supervisor review. This paper analyzes the effectiveness of this approach and database in coding major field of study for the Beginning Postsecondary Students Longitudinal Study Second Followup 1990-1994.

Introduction

NCES projects have frequently collected data concerning major field of study. Until recently, two procedures have been used to gather these data. First, experts in the Classification of Instructional Programs (CIP) examined respondents' text strings and assigned 6-digit codes from about 1,400 possibilities. Second, respondents selected 2-digit codes from a list of about 35 possibilities. The correspondence between the 2-digit and 6-digit codes was artificially high because the expert coders viewed both the text strings and the respondents' 2-digit selections. Inter-rater reliability within the expert pool was typically in the .80-.90 range after 3-4 days of training.

The advent of Computer Assisted Telephone Interviewing (CATI) enhanced the capability for obtaining higher quality text strings describing respondents' major fields of study. However, on-line coding into the 2-digit codes proved very time consuming. In addition, post-coding of text strings into CIP 6-digit codes delayed file delivery and sometimes resulted in expensive call-back procedures.

Furthermore, researchers were frequently critical of the 6-digit and/or 2-digit codings. Simply put, the 6-digit codes were too complex for analyses and the 2-digit codes did not provide adequate detail for analyses. One of the major critics developed an alternative 3-digit system with 111 possible codes based on patterns of courses described within A College Course Map: Taxonomy and Transcript Data.

Finally, as researchers become more sophisticated users of text string data (or as software improves handling of strings), the value of high-quality text strings becomes paramount in data collection systems, including CATI. Coding becomes primarily a key for sorting or subsetting collections of text strings.

Methodology for Data Collection

All of the factors outlined above contributed to the development of a new approach for coding major field of study in current NCES CATI projects. The new coding approach incorporates on-line coding into our existing CATI system, using a 3-digit classification system. The existing CATI system executes the NCES major field of study coding software. The coding software then takes over all CATI functions for the major field of study question, and returns a response string and a 3-digit code to the existing CATI system. The CATI system then stores these data and proceeds with the next question for the respondent. The coding software takes care of prompting the CATI operator throughout the coding session. Initially, the respondent is asked an open question, "What is your major field of study?", and the respondent's reply is entered into the coding