Errors and disfluencies in spoken corpora: Setting the scene

In this introduction, we provide a broad overview of errors and disfluencies, showing how they are defined in the literature, how they are distinguished from each other, and what impact the corpus revolution has had on their study. We also demonstrate the value of investigating these phenomena by examining some possible applications of such research.
