Towards Effective Integration and Positive Impact of Automated Writing Evaluation in L2 Writing

The increasing dominance of English has elevated the need to develop an ability to communicate effectively in writing, and this has put a strain on second language education programs worldwide. Faced with time-consuming and copious commenting on student drafts and inspired by the promise of computerized writing assessment, many "educational technology enthusiasts are looking to AWE [automated writing evaluation] as a silver bullet for language and literacy development" (Warschauer & Ware, 2006, p. 175). This chapter reviews what AWE offers for learners and teachers and raises a number of controversies regarding AWE effectiveness, with the underlying message that clear milestone targets need to be set for AWE development, implementation, and evaluation in order to ensure a positive impact of this technology on L2 writing. In support of this message, the chapter introduces an example, IADE, a prototype of context-based AWE conceptualized and operationalized to address latent issues through a synthesis of theoretical premises and learning needs. A multifaceted empirical evaluation of IADE further provides insights into the processes triggered by interaction with AWE technology and foregrounds a call for the future research needed to inform effective application of AWE in L2 writing classrooms.

1. Automated Writing Evaluation for Learning and Teaching

Automated writing evaluation (AWE), defined as "the ability of computer technology to evaluate and score written prose" (Shermis & Burstein, 2003, p. xiii), is informed by educational measurement, computational linguistics, and cognitive science and pedagogy. In other words, psychometric evaluations of reliability and validity, considerations about intelligent operational systems and their functionality, and models that reflect the thought processes and factors considered most beneficial for learners have all contributed to the development of AWE systems.

This technology originated from automated scoring engines and was initially referred to as computerized essay scoring, computer essay grading, computer-assisted writing assessment, or machine scoring of essays. The first scoring system, Project Essay Grade (PEG), was developed in the 1960s and employed multiple regression analysis to predict scores from measurable variables in the form of surface linguistic features (Page, 1994). The systems that followed PEG, namely Intelligent Essay Assessor (IEA), Electronic Essay Rater (e-rater), Conceptual Rater (c-rater), Schema Extract Analyze and Report (SEAR), Paperless School freetext Marking Engine (PS-ME), Automark, and AntMover,¹ assess written constructed responses using natural language processing (NLP)² in combination with statistical techniques that analyze a wide range of aspects of the writing construct, such as grammar, syntactic complexity, mechanics, style, topical content, content development, deviance, and so on. Before long, the systems were reconfigured to generate intelligent feedback on all of these features.
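The regression-based approach behind PEG can be pictured with a brief sketch. The Python fragment below is purely illustrative, not a reconstruction of PEG itself: the choice of surface features, the toy training essays, and the human scores are invented for the example, and a real engine would be trained on large sets of human-rated essays.

```python
# Illustrative sketch of a PEG-style scorer: countable surface features are
# extracted from each essay and a multiple regression model maps them to a
# holistic score. Features and the tiny training set are invented examples.
import re
import numpy as np
from numpy.linalg import lstsq


def surface_features(essay: str) -> list[float]:
    """Extract simple, countable proxies of the kind early scoring engines used."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words) or 1
    n_sentences = len(sentences) or 1
    return [
        n_words,                               # essay length
        sum(len(w) for w in words) / n_words,  # mean word length
        n_words / n_sentences,                 # mean sentence length
        essay.count(","),                      # punctuation as a rough complexity proxy
    ]


# Hypothetical training data: essays paired with human-assigned holistic scores.
training_essays = [
    "Short answer. Few words here.",
    "This essay develops an argument across several sentences, and it uses "
    "longer words, subordinate clauses, and varied punctuation throughout.",
]
human_scores = np.array([2.0, 5.0])

# Fit the regression weights (with an intercept column) against the human scores.
X = np.array([surface_features(e) for e in training_essays])
X = np.hstack([np.ones((X.shape[0], 1)), X])
weights, *_ = lstsq(X, human_scores, rcond=None)


def predict_score(essay: str) -> float:
    """Predict a holistic score for a new essay from its surface features."""
    x = np.array([1.0, *surface_features(essay)])
    return float(x @ weights)


print(round(predict_score("A new essay with a moderate number of words."), 2))
```

The same pipeline shape, extracting measurable features, fitting weights against human ratings, and applying those weights to new essays, underlies later statistical scoring engines as well, although they add NLP-derived measures of grammar, style, and content of the kind listed above.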
These systems led to the development of AWE products like the pioneering Writer's Workbench (MacDonald, Frase, Gingrich, & Keenan, 1982) and leading-edge programs like Criterion, MY Access! (for further information on this writing program, see Ware & Rivas's chapter in this volume), WriteToLearn, Summary Street, and Holt Online Essay Assessor.³ These commercially available products are increasingly being used in writing classrooms, shifting the role of AWE from pure assessment to assistance for learning (Chen & Cheng, 2008; Warschauer & Grimes, 2008) by offering both automated feedback and a wide range of complementary tools and features intended to help students (Burstein, Chodorow, & Leacock, 2004; Pearson Education, 2007; Vantage Learning, 2007).

In Criterion, for example, students can solicit and receive feedback from their teacher through the program's interface. This helps them focus not only on automatically detectable errors but also on other, more subtle aspects of writing identified by the teacher. Students can also view their performance summary, which includes a holistic score, the number of errors, and links to detailed feedback on each error category. In addition, Criterion has a context-sensitive Writer's Handbook that provides additional definitions and lessons. To assist students in their planning process, the program also offers a 'Make a Plan' tool with a choice of eight templates for planning strategies. MY Access!, in turn, has an online writing coach, which evaluates student writing and provides revision goals and remediation activities for each of the writing traits, as well as an editor, which highlights errors and provides editing suggestions. It also offers a writer's checklist for guidance, scoring rubrics for self-assessment, word banks for appropriate vocabulary use, and graphical prewriting tools for better formulation and organization of ideas. WriteToLearn has similar options; in addition, it allows students to hear the text in read-aloud mode. A schematic sketch of the kind of category-based feedback and performance summary these programs generate appears after the notes below.

Notes

1. See Landauer, Laham, and Foltz (2003), Attali and Burstein (2006), Burstein, Leacock, and Swartz (2001), Christie (1999), Mason and Grove-Stephenson (2002), Mitchell, Russell, Broomhead, and Aldridge (2002), and Anthony (2003), respectively.
2. NLP is a branch of Artificial Intelligence.
3. For comprehensive reviews of these automated scoring systems, see Chapelle and Chung (2010), Dikli (2006), Phillips (2007), and Valenti, Neri, and Cucchiarelli (2003).
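To make that kind of feedback more concrete, the sketch below shows one way category-level error detection could feed a performance summary of the sort these programs display: a holistic score, error counts, and per-category comments. It is a minimal illustration under invented rules, invented class names (Feedback, PerformanceSummary), and an invented scoring formula, not the implementation of Criterion, MY Access!, or WriteToLearn.

```python
# Illustrative sketch only: a rule-based pass flags errors by category, and the
# results are rolled up into a performance summary (holistic score, error
# counts, per-category feedback). Rules and the score formula are invented.
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Feedback:
    category: str        # e.g., "grammar", "mechanics", "style"
    message: str
    span: tuple[int, int]  # character offsets of the flagged text


@dataclass
class PerformanceSummary:
    holistic_score: float
    error_counts: dict[str, int] = field(default_factory=dict)
    feedback: list[Feedback] = field(default_factory=list)


# A few toy detection rules: (category, pattern, message).
RULES = [
    ("mechanics", re.compile(r"\bi\b"), "Capitalize the pronoun 'I'."),
    ("grammar", re.compile(r"\ba\s+[aeiou]\w*", re.I), "Use 'an' before a vowel sound."),
    ("style", re.compile(r"\bvery\s+very\b", re.I), "Avoid repeated intensifiers."),
]


def evaluate(essay: str) -> PerformanceSummary:
    """Run every rule over the essay and assemble the summary a student would see."""
    feedback = []
    counts = defaultdict(int)
    for category, pattern, message in RULES:
        for match in pattern.finditer(essay):
            feedback.append(Feedback(category, message, match.span()))
            counts[category] += 1
    # Toy holistic score: start from 6 and subtract a penalty per flagged error.
    score = max(1.0, 6.0 - 0.5 * len(feedback))
    return PerformanceSummary(score, dict(counts), feedback)


summary = evaluate("i think this is a essay that is very very good.")
print(summary.holistic_score, summary.error_counts)
```

A production AWE engine would replace the toy regular-expression rules with NLP components and statistical models, but the report structure that reaches the student, a score plus categorized, linked feedback, is similar in spirit to what the chapter describes.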

[1] J. M. Swales (1993). Genre Analysis: English in Academic and Research Settings.

[2] M. Warschauer & D. Grimes (2008). Automated Writing Assessment in the Classroom.

[3] S. Dikli (2006). An Overview of Automated Scoring of Essays.

[4] Y. R. Dong (1998). Non-native Graduate Students' Thesis/Dissertation Writing in Science: Self-reports by Students and Their Advisors from Two U.S. Institutions.

[5] R. H. Haswell et al. (2006). Machine Scoring of Student Essays.

[6] S. Graham et al. (2006). Strategy Instruction and the Teaching of Writing: A Meta-Analysis.

[7] R. Ellis (2005). Principles of instructed language learning.

[8] J. W. Creswell et al. (2010). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches.

[9] E. A. Borg et al. (2005). An ecological perspective on content-based instruction.

[10] E. Cotos et al. (2008). Automatic Identification of Discourse Moves in Scientific Article Introductions.

[11] J. R. Anderson (1982). Acquisition of cognitive skill.

[12] M. K. Jordan et al. (1993). The Role of Writing in Graduate Engineering Education: A Survey of Faculty Beliefs and Practices.

[13] T. Mitchell et al. (2002). Towards robust computerised marking of free-text responses.

[14] J. R. Hayes & L. S. Flower (1980). Identifying the organization of writing processes.

[15] C.-F. E. Chen & W.-Y. E. Cheng (2008). Beyond the Design of Automated Writing Evaluation: Pedagogical Practices and Perceived Learning Effectiveness in EFL Writing Classes.

[16] M. H. Long (1996). The Role of the Linguistic Environment in Second Language Acquisition.

[17] Y. Fang (2010). Perceptions of the Computer-Assisted Writing Program among EFL College Learners. Journal of Educational Technology & Society.

[18] J. Burstein, M. Chodorow, & C. Leacock (2004). Automated essay evaluation.

[19] M. Warschauer & P. Ware (2006). Automated writing evaluation: defining the classroom research agenda.

[20] M. D. Shermis & J. Burstein (2003). Automated Essay Scoring: A Cross-disciplinary Perspective.

[21] C. C. Block et al. (2001). Comprehension Instruction: Research-Based Best Practices.

[22] K. Hyland (2001). Disciplinary Discourses: Social Interactions in Academic Writing.

[23] S. Valenti, F. Neri, & A. Cucchiarelli (2003). An Overview of Current Research on Automated Essay Grading. Journal of Information Technology Education.

[24] W. Tsou (2008). The Effect of a Web-Based Writing Program in College English Writing Classes. Eighth IEEE International Conference on Advanced Learning Technologies.

[25] E. Cotos (2009). Designing an intelligent discourse evaluation tool: Theoretical, empirical, and technological considerations.

[26] J. Christie (1999). Automated essay marking for style and content.

[27] J. Burstein, C. Leacock, & R. Swartz (2001). Automated evaluation of essays and short answers.

[28] M. A. K. Halliday & R. Hasan (1976). Cohesion in English.

[29] A. Newell & P. S. Rosenbloom (1993). Mechanisms of Skill Acquisition and the Law of Practice.

[30] L. Anthony et al. (2003). Mover: a machine learning tool to assist in the reading and writing of technical papers.

[31] O. Mason & I. Grove-Stephenson (2002). Automated free text marking with Paperless School.

[32] J. Schroeder et al. (2008). The Impact of Criterion Writing Evaluation Technology on Criminal Justice Student Writing Skills.

[33] E. Cotos (2011). Potential of Automated Writing Evaluation Feedback.

[34] I. Blood (2011). Automated Essay Scoring: A Literature Review.

[35] M. Warschauer et al. (2010). Utility in a Fallible Tool: A Multi-Site Case Study of Automated Writing Evaluation.