Automatic segmentation of infant cry signals using hidden Markov models

Automatic extraction of acoustic regions of interest from recordings captured in realistic clinical environments is a necessary preprocessing step in any cry analysis system. In this study, we propose a hidden Markov model (HMM)-based audio segmentation method that identifies the relevant acoustic parts of the cry signal (i.e., the expiratory and inspiratory phases) in recordings made in natural environments with various interfering acoustic sources. We examine and optimize the performance of the system using different audio features and HMM topologies. In particular, we propose using fundamental frequency and aperiodicity features. We also propose a method for adapting a segmentation system trained on material captured in one acoustic environment to a different acoustic environment by means of feature normalization and semi-supervised learning (SSL). The performance of the system was evaluated on a total of 3 h and 10 min of audio material from 109 infants, captured in a variety of recording conditions in hospital wards and clinics. The proposed system achieves a frame-based accuracy of up to 89.2%. We conclude that the proposed system offers a solution for automated segmentation of cry signals in cry analysis applications.
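As an illustration of the general idea only, the following minimal sketch decodes a frame-level label sequence with the Viterbi algorithm and scores it with frame-based accuracy. It is not the system evaluated in the study: the three-state layout (expiration, inspiration, and a residual state for interfering sounds), the transition probabilities, and the toy per-frame likelihoods are all assumptions chosen for the example, and no real audio features are computed.

```python
import numpy as np

# Hypothetical state layout; the "residual" class is an assumption for this sketch.
STATES = ["expiration", "inspiration", "residual"]


def viterbi(log_emis, log_trans, log_prior):
    """Return the most likely state path for (T, S) frame log-likelihoods."""
    n_frames, n_states = log_emis.shape
    delta = np.empty((n_frames, n_states))           # best log-score ending in each state
    psi = np.zeros((n_frames, n_states), dtype=int)  # back-pointers
    delta[0] = log_prior + log_emis[0]
    for t in range(1, n_frames):
        # scores[i, j]: best path ending in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(n_states)] + log_emis[t]
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(n_frames - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path


def frame_accuracy(predicted, reference):
    """Fraction of frames whose predicted label matches the reference label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(reference)))


# Toy per-frame likelihoods: five expiration-like frames (one of them noisy),
# followed by five residual-like frames.
emis = np.full((10, 3), 0.1)
emis[:5, 0] = 0.8
emis[5:, 2] = 0.8
emis[2] = [0.3, 0.5, 0.2]  # a single frame that looks like inspiration

# "Sticky" transitions: staying in a state is far more likely than switching.
trans = np.full((3, 3), 0.05) + np.eye(3) * 0.85
prior = np.full(3, 1.0 / 3.0)

reference = np.array([0] * 5 + [2] * 5)
framewise = np.argmax(emis, axis=1)  # independent per-frame decision
decoded = viterbi(np.log(emis), np.log(trans), np.log(prior))

print([STATES[s] for s in decoded])
print("frame-wise accuracy:", frame_accuracy(framewise, reference))
print("Viterbi accuracy:", frame_accuracy(decoded, reference))
```

In this toy case the transition model smooths over the single noisy frame that an independent per-frame decision mislabels, which is the main reason to decode with an HMM rather than classify frames in isolation.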
