Automated Transcription and Conversation Analysis

This article explores the potential of automated transcription technology for use in Conversation Analysis (CA). First, it applies auto-transcription to a classic CA recording and compares the output with Gail Jefferson’s original transcript. Second, it applies auto-transcription to more recent recordings to demonstrate transcript quality under ideal conditions. Third, it examines the use of auto-transcripts for navigating big conversational data sets. The article concludes that although standard automated transcription technology lacks certain critical capabilities and exhibits varying levels of accuracy, it may still be useful for (a) providing first-pass transcripts, with silences, for further manual editing; and (b) scaling up data exploration and collection building by providing time-based indices that require no manual effort to generate. Data are in American English.
