The design of an automatic analysis program for L2 text research: Necessity and feasibility

Abstract Several first and second language (L1 and L2) text researchers have recently used automatic analysis programs and computerized corpora to facilitate large-scale multivariate analyses of written discourse (e.g., Biber, 1988; Connor, 1990; Connor & Biber, 1989; Grabe, 1987; Grabe & Biber, 1987; Reid, 1990). Although it is clear that automated analyses make important quantitative research much more feasible, there is a potential problem with applying computer programs to L2 texts: Many lexical and syntactic features of L2 writing are in varying developmental stages, and programs created to analyze L1 texts in “target” form may underestimate and/or mislabel structures in L2 writing. This article explores the necessity for and feasibility of designing a computer program specifically for the analysis of L2 texts. Using data from a large L2 text analysis (160 texts; 62 variables) in which automatic analysis was not used, it is demonstrated that a program designed for L1 texts would not be accurate enough to capture completely the structures used by L2 writers. Following this analysis, suggestions are made as to how an L2 text analysis program could be created and applied.