Verbal Protest Recognition in Children with Autism

Real-time detection of verbal protest (crying induced by sensory overload) in children with autism is a first step toward understanding the precursors of the challenging behaviors associated with autism. Such detection is useful both to autism researchers exploring just-in-time intervention techniques and to researchers working on audio event detection in routine living environments. In this paper, we examine, adapt, and improve upon two techniques for verbal protest recognition and tailor them to children with autism spectrum disorder (ASD). The first technique is a Gaussian Mixture Model (GMM) with stacking. The second uses Convolutional Neural Networks (CNNs) trained on log Mel-filter bank (LMFB) features. We then evaluate accuracy, focusing on real-world false positive rates and on reducing dataset bias by introducing noise and input perturbation.
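The abstract does not say how noise and input perturbation are introduced, so the following is only an illustrative sketch of one common approach: mixing background noise into a clip at a chosen signal-to-noise ratio and applying a small random time shift. The function names `add_noise_at_snr` and `random_time_shift`, the SNR value, and the circular shift are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def add_noise_at_snr(signal, noise, snr_db):
    """Mix `noise` into `signal` so the mixture has roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Tile or trim the noise so it matches the signal length.
    reps = int(np.ceil(len(signal) / len(noise)))
    noise = np.tile(noise, reps)[: len(signal)]

    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    if signal_power == 0.0 or noise_power == 0.0:
        # Nothing meaningful to scale; return the clip unchanged.
        return signal.copy()

    # Choose a scale so that 10*log10(signal_power / scaled_noise_power) == snr_db.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return signal + scale * noise


def random_time_shift(signal, max_shift, rng):
    """Circularly shift `signal` by a random offset in [-max_shift, max_shift].

    A circular shift is an assumption chosen for simplicity; samples that
    fall off one end wrap around to the other.
    """
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(signal, shift)


# Example: a synthetic 1-second tone at 16 kHz stands in for a real recording.
rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
background = rng.standard_normal(4000)

noisy = add_noise_at_snr(clip, background, snr_db=10.0)
augmented = random_time_shift(noisy, max_shift=800, rng=rng)
```

Augmented clips of this kind would be generated before feature extraction, so that both the GMM and the CNN see the perturbed audio rather than only clean recordings.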
