Speakers enhance contextually confusable words

Recent work has found evidence that natural languages are shaped by pressures for efficient communication — e.g. the more contextually predictable a word is, the fewer speech sounds or syllables it has (Piantadosi et al. 2011). Research on the degree to which speech and language are shaped by pressures for effective communication — robustness in the face of noise and uncertainty — has been more equivocal. We develop a measure of contextual confusability during word recognition based on psychoacoustic data. Applying this measure to naturalistic speech corpora, we find evidence suggesting that speakers alter their productions to make contextually more confusable words easier to understand.

[1]  Jeffrey M. Wooldridge.  Econometric Analysis of Cross Section and Panel Data , 2002 .

[2]  Michael S Vitevitch,et al.  The influence of phonological similarity neighborhoods on speech production. , 2002, Journal of experimental psychology. Learning, memory, and cognition.

[3]  Steven T. Piantadosi,et al.  The communicative function of ambiguity in language , 2011, Cognition.

[4]  Florien J. van Beinum,et al.  Efficiency as an organizing principle of natural speech , 1998, ICSLP.

[5]  Christopher D. Manning,et al.  Probabilistic models of word order and syntactic discontinuity , 2005 .

[6]  P. Luce,et al.  Phonological Neighborhood Effects in Spoken Word Perception and Production , 2016 .

[7]  Andreas Stolcke,et al.  SRILM at Sixteen: Update and Outlook , 2011 .

[8]  Esteban Buz,et al.  Dynamic hyperarticulation of coda voicing contrasts. , 2016, The Journal of the Acoustical Society of America.

[9]  S. Piantadosi,et al.  Info/information theory: Speakers choose shorter words in predictive contexts , 2013, Cognition.

[10]  Vera Demberg,et al.  Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts , 2012, EMNLP.

[11]  Anne Christophe,et al.  Words cluster phonetically beyond phonotactic regularities , 2017, Cognition.

[12]  Elizabeth Hume,et al.  Nasal place assimilation trades off inferrability of both target and trigger words , 2018, Laboratory Phonology: Journal of the Association for Laboratory Phonology.

[13]  Sharon Goldwater,et al.  Talkers account for listener and channel characteristics to communicate efficiently , 2015 .

[14]  Keith Johnson,et al.  Why reduce? Phonological neighborhood density and phonetic reduction in spontaneous speech , 2012 .

[15]  Louis C. W. Pols,et al.  How efficient is speech , 2003 .

[16]  D. Norris,et al.  Shortlist B: a Bayesian model of continuous speech recognition. , 2008, Psychological review.

[17]  Anne Cutler,et al.  Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception , 2005, Speech Commun..

[18]  T. Florian Jaeger,et al.  Signal Reduction and Linguistic Encoding , 2017 .

[19]  Uriel Cohen Priva.  Informativity affects consonant duration and deletion rates , 2015 .

[20]  R. Schiffer.  Psychobiology of Language , 1986 .

[21]  M. Picheny,et al.  Speaking clearly for the hard of hearing. II: Acoustic characteristics of clear and conversational speech. , 1986, Journal of speech and hearing research.

[22]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[23]  F Grosjean,et al.  Spoken word recognition processes and the gating paradigm , 1980, Perception & psychophysics.

[24]  Jason M. Brenier,et al.  Predictability Effects on Durations of Content and Function Words in Conversational English , 2009 .

[25]  Scott Seyfarth,et al.  Word informativity influences acoustic duration: Effects of contextual predictability on lexical representation , 2014, Cognition.

[26]  Anne Cutler,et al.  Unfolding of phonetic information over time: a database of Dutch diphone perception. , 2003, The Journal of the Acoustical Society of America.

[27]  M. Tanenhaus,et al.  Dynamically adapted context-specific hyper-articulation: Feedback from interlocutors affects speakers' subsequent pronunciations. , 2016, Journal of memory and language.

[28]  Anne Cutler,et al.  Tracking perception of the sounds of English. , 2014, The Journal of the Acoustical Society of America.

[29]  Kenneth Heafield,et al.  KenLM: Faster and Smaller Language Model Queries , 2011, WMT@EMNLP.

[30]  Roger Levy,et al.  A noisy-channel model of rational human sentence comprehension under uncertain input , 2008, EMNLP 2008.

[31]  John R. Anderson.  Is human cognition adaptive? , 1991, Behavioral and Brain Sciences.

[32]  D. Mirman,et al.  Competition and cooperation among similar representations: toward a unified account of facilitative and inhibitory effects of lexical neighbors. , 2012, Psychological review.

[33]  William D. Raymond,et al.  Probabilistic Relations between Words: Evidence from Reduction in Lexical Production , 2008 .

[34]  Sang Joon Kim,et al.  A Mathematical Theory of Communication , 2006 .

[35]  Jessamyn Schertz,et al.  Exaggeration of featural contrasts in clarifications of misheard speech in English , 2013, J. Phonetics.

[36]  Björn Lindblom,et al.  Explaining Phonetic Variation: A Sketch of the H&H Theory , 1990 .

[37]  G S Dell,et al.  A spreading-activation theory of retrieval in sentence production. , 1986, Psychological review.

[38]  J. Ohala.  Papers in Laboratory Phonology: The phonetics and phonology of aspects of assimilation , 1990 .

[39]  David Miller,et al.  The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text , 2004, LREC.

[40]  George Kingsley Zipf,et al.  The Psychobiology of Language , 2022 .

[41]  Yuen Ren Chao,et al.  Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology , 1950 .

[42]  Andrew Gelman,et al.  Fitting Multilevel Models When Predictors and Group Effects Correlate , 2007 .

[43]  Alice Turk,et al.  The Smooth Signal Redundancy Hypothesis: A Functional Explanation for Relationships between Redundancy, Prosodic Prominence, and Duration in Spontaneous Speech , 2004, Language and speech.

[44]  Kathleen Currie Hall,et al.  The role of predictability in shaping phonological patterns , 2018, Linguistics Vanguard.

[45]  Julia F. Strand,et al.  Many neighborhoods: Phonological and perceptual neighborhood density in lexical production and perception , 2016 .

[46]  R. Wright.  Phonetically Based Phonology: A review of perceptual cues and cue robustness , 2004 .

[47]  Christo Kirov,et al.  The Specificity of Online Variation in Speech Production , 2012, CogSci.

[48]  Mark Steedman,et al.  The NXT-format Switchboard Corpus: a rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue , 2010, Lang. Resour. Evaluation.

[49]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[50]  John R. Anderson.  The Adaptive Character of Thought , 1990 .

[51]  Uriel Cohen Priva.  Using Information Content to Predict Phone Deletion , 2008 .

[52]  Steven T Piantadosi,et al.  Word lengths are optimized for efficient communication , 2011, Proceedings of the National Academy of Sciences.

[53]  R. Levy.  Expectation-based syntactic comprehension , 2008, Cognition.

[54]  William D. Raymond,et al.  The Buckeye corpus of conversational speech: labeling conventions and a test of transcriber reliability , 2005, Speech Commun..

[55]  M. Aylett,et al.  Language redundancy predicts syllabic duration and the spectral characteristics of vocalic syllable nuclei. , 2006, The Journal of the Acoustical Society of America.