A Knowledge Driven Structural Segmentation Approach for Play-Talk Classification During Autism Assessment

Automatically segmenting conversational audio into semantically relevant components has both computational and analytical significance. In this paper, we segment the play activities and conversational portions that are interspersed in clinically administered interactions between a psychologist and a child with autism spectrum disorder (ASD). We show that various acoustic-prosodic and turn-taking features commonly used in the literature differ between these two segment types, and hence may influence downstream inference tasks. We adopt a two-step approach to the segmentation problem that takes advantage of the structural relation between the two segment types. First, we use a supervised machine learning algorithm to estimate class posteriors at the frame level. Next, we use an explicit-duration hidden Markov model (EDHMM) to align the states using the posteriors from the previous step. The duration distributions for both play and talk regions are learnt from training data and modeled using the EDHMM. Our results show that speech features can be used to successfully discriminate between play and talk activities, each of which provides important insights into the child’s condition.
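The second step of the approach can be illustrated with a minimal sketch of explicit-duration Viterbi decoding over frame-level posteriors. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `edhmm_decode` and `make_toy_inputs`, the use of log posteriors directly as emission scores, the synthetic posteriors (a real system would obtain them from the trained frame-level classifier), and the uniform duration distribution over 10 to 40 frames (the paper learns duration distributions from training data).

```python
import numpy as np

def edhmm_decode(log_post, log_dur, log_trans, log_init):
    """Viterbi decoding for an explicit-duration HMM (illustrative sketch).

    log_post : (T, K) frame-level log posteriors, used here as emission scores
    log_dur  : (K, D) log P(segment duration = d + 1 | state)
    log_trans: (K, K) log transition probabilities between segments
    log_init : (K,)   log initial state probabilities
    Returns a length-T integer array of state labels.
    """
    T, K = log_post.shape
    D = log_dur.shape[1]
    # cum[t, k] = sum of log_post[:t, k], so any segment score is a difference.
    cum = np.vstack([np.zeros((1, K)), np.cumsum(log_post, axis=0)])
    delta = np.full((T + 1, K), -np.inf)       # best score of a segment ending at t
    back = np.zeros((T + 1, K, 2), dtype=int)  # (duration, previous state)
    for t in range(1, T + 1):
        for k in range(K):
            for d in range(1, min(D, t) + 1):
                start = t - d
                seg = cum[t, k] - cum[start, k] + log_dur[k, d - 1]
                if start == 0:
                    prev, score = -1, log_init[k] + seg
                else:
                    cand = delta[start] + log_trans[:, k]
                    prev = int(np.argmax(cand))
                    score = cand[prev] + seg
                if score > delta[t, k]:
                    delta[t, k] = score
                    back[t, k] = (d, prev)
    if not np.isfinite(delta[T]).any():
        raise ValueError("no valid segmentation under the duration model")
    # Trace back segment by segment from the last frame.
    labels = np.empty(T, dtype=int)
    t, k = T, int(np.argmax(delta[T]))
    while t > 0:
        d, prev = back[t, k]
        labels[t - d:t] = k
        t, k = t - d, prev
    return labels

def make_toy_inputs():
    """Synthetic inputs (assumed): 30 'play' frames (0), then 30 'talk' frames (1)."""
    p_talk = np.array([0.2] * 30 + [0.8] * 30)
    p_talk[12], p_talk[45] = 0.8, 0.2          # two isolated classifier errors
    log_post = np.log(np.column_stack([1.0 - p_talk, p_talk]))
    # Assumed duration model: uniform over 10..40 frames, shorter segments forbidden.
    log_dur = np.full((2, 40), -np.inf)
    log_dur[:, 9:] = np.log(1.0 / 31.0)
    # Two states that must alternate: no self-transitions between segments.
    log_trans = np.array([[-np.inf, 0.0], [0.0, -np.inf]])
    log_init = np.log(np.array([0.5, 0.5]))
    return log_post, log_dur, log_trans, log_init

log_post, log_dur, log_trans, log_init = make_toy_inputs()
framewise = np.argmax(log_post, axis=1)        # step 1 only: per-frame decision
decoded = edhmm_decode(log_post, log_dur, log_trans, log_init)
print("frame-wise errors:", int(np.sum(framewise != np.repeat([0, 1], 30))))
print("EDHMM errors:     ", int(np.sum(decoded != np.repeat([0, 1], 30))))
```

In this toy setting the per-frame decision flips at the two corrupted frames, whereas the duration constraint forces the decoder to explain them as part of the surrounding segment. The triple loop costs O(T·K·D) candidate evaluations, which is adequate for a sketch but would likely need vectorizing for long sessions.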
