Language is Not About Language: Towards Formalizing the Role of Extra-Linguistic Factors in Human and Machine Language Acquisition and Communication

Despite extensive research on early language acquisition (LA), it is still unclear how young children learn to transform their noisy and ambiguous auditory experience into the symbolic, compositional representational system known as language. This paper argues that a major obstacle to a more comprehensive picture of LA is the lack of a unified conceptual framework that captures the full range of factors critical to language learning in real-world contexts, and that we should pursue such a framework so that individual behavioral studies and computational models can be placed in a mutually compatible context. As an example of the issue, the widely used standard model of the speech chain (a description of the information flow from the talker’s idea to the listener’s interpretation of the meaning of the spoken message) is shown to be insufficient for characterizing learning and communication in natural contexts. Instead, a realistic model should account for the inherent multimodality and contextual dependency of communication and learning by formally acknowledging the role of a shared communicative context, the interlocutors’ subjective representations of the shared situation, and the way these factors drive message generation and speech perception in order to acquire information about the external world. By understanding how language is connected to the more generic sensorimotor and predictive processing principles of human cognition, we can also start to understand the core forces that drive language learning in natural environments and across varying individual developmental trajectories.
