Robust and accurate features for detecting and diagnosing autism spectrum disorders

In this paper, we report experiments on the Interspeech 2013 Autism Challenge, which comprises two subtasks: detecting children with autism spectrum disorder (ASD) and classifying them into four subtypes. We apply our recently developed algorithm for extracting speech features, which overcomes certain weaknesses of other currently available algorithms [1, 2]. From the input speech signal, we estimate, for each frame, the parameters of a harmonic model of voiced speech, including the fundamental frequency (f0). From the fundamental frequencies and the reconstructed noise-free signal, we compute derived features such as harmonic-to-noise ratio (HNR), shimmer, and jitter. In previous work, we found that these features detect voiced segments and speech more accurately than other algorithms and that they are useful in rating the severity of a subject’s Parkinson’s disease [3]. Here, we employ these features along with standard energy, cepstral, and spectral features. With these features, we detect ASD using a regression model and identify the subtype using a classifier. We find that our features improve performance, measured in terms of unweighted average recall (UAR), over the baseline results by 2.3% for detecting autism spectrum disorder and by 2.8% for classifying the disorder into four categories.

Index Terms: speech analysis, autism spectrum disorder
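
To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes f0 is already known for a frame (the paper estimates it jointly with the other model parameters), fits a sum of harmonics by ordinary least squares, and derives HNR from the fitted signal and the residual. It also computes UAR as the mean of per-class recalls. The function names, the number of harmonics, and the frame length are illustrative assumptions.

```python
import numpy as np


def fit_harmonic_frame(frame, f0, sample_rate, num_harmonics=5):
    """Least-squares fit of harmonics of a known f0 to one frame.

    Assumption: f0 is given. Returns (harmonic_part, residual).
    """
    frame = np.asarray(frame, dtype=float)
    t = np.arange(len(frame)) / sample_rate
    k = np.arange(1, num_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)        # shape (samples, harmonics)
    basis = np.hstack([np.cos(phase), np.sin(phase)])  # cosine and sine terms
    coeffs, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    harmonic = basis @ coeffs                         # reconstructed harmonic signal
    return harmonic, frame - harmonic


def hnr_db(harmonic, residual, eps=1e-12):
    """Harmonic-to-noise ratio in dB: harmonic energy over residual energy."""
    return 10.0 * np.log10((np.sum(harmonic ** 2) + eps) /
                           (np.sum(residual ** 2) + eps))


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr, f0, n = 16000, 200.0, 400                    # 25 ms frame (assumed)
    t = np.arange(n) / sr
    clean = np.sin(2 * np.pi * f0 * t) + 0.5 * np.cos(2 * np.pi * 2 * f0 * t)
    noisy = clean + 0.1 * rng.standard_normal(n)
    harmonic, residual = fit_harmonic_frame(noisy, f0, sr)
    print("HNR (dB):", hnr_db(harmonic, residual))
    print("UAR:", unweighted_average_recall([0, 0, 0, 1], [0, 0, 0, 0]))
```

Because UAR averages recall over classes rather than over samples, a classifier that always predicts the majority class scores only chance level, which is why the measure suits the unbalanced classes typical of this kind of task.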

[1]  R. Hu Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) , 2003 .

[2]  Fabio Valente,et al.  The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism , 2013, INTERSPEECH.

[3]  Yael Adini,et al.  Abnormal Speech Spectrum and Increased Pitch Variability in Young Autistic Children , 2011, Front. Hum. Neurosci.

[4]  Yannis Stylianou,et al.  Harmonic plus noise models for speech, combined with statistical methods, for speech and speaker modification , 1996 .

[5]  José Carlos Príncipe,et al.  An adaptive decoder from spike trains to micro-stimulation using kernel least-mean-squares (KLMS) , 2011, 2011 IEEE International Workshop on Machine Learning for Signal Processing.

[6]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SIGKDD Explorations.

[7]  Janet B W Williams,et al.  Diagnostic and Statistical Manual of Mental Disorders , 2013 .

[8]  Simon J. Godsill,et al.  Bayesian harmonic models for musical pitch estimation and analysis , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[9]  Meysam Asgari,et al.  Robust detection of voiced segments in samples of everyday conversations using unsupervised HMMs , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).

[10]  Kathleen Hubbard,et al.  Intonation and Emotion in Autistic Spectrum Disorders , 2007, Journal of psycholinguistic research.

[11]  I. Shafran,et al.  Extracting cues from speech for predicting severity of Parkinson's disease , 2010, 2010 IEEE International Workshop on Machine Learning for Signal Processing.

[12]  Fabien Ringeval,et al.  Automatic Intonation Recognition for the Prosodic Assessment of Language-Impaired Children , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[13]  Shlomo Dubnov,et al.  Maximum a-posteriori probability pitch tracking in noisy environments using harmonic model , 2004, IEEE Transactions on Speech and Audio Processing.