DETECTION OF ASPHYXIA IN INFANTS USING DEEP LEARNING CONVOLUTIONAL NEURAL NETWORK (CNN) TRAINED ON MEL FREQUENCY CEPSTRUM COEFFICIENT (MFCC) FEATURES EXTRACTED FROM CRY SOUNDS

Deep Learning Neural Network (DLNN) is a new branch of machine learning with a greater capacity for complex feature representation than traditional neural networks. Although it is mainly suited to image features (it was inspired by the object recognition mechanism of the mammalian visual system), other types of data can also be used with DLNN, provided their features can be translated into an image-like form. In this paper, we show that Mel Frequency Cepstrum Coefficient (MFCC) features extracted from the audio signal of infant cries can be used as input features for a Convolutional Neural Network (CNN). The results show that the CNN can classify between normal and pathological (asphyxiated) cries with 94.3% accuracy on the training set and 92.8% accuracy on the testing set.
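To illustrate how MFCC features can take an image-like form, the following is a minimal sketch rather than the authors' pipeline. It computes a frames-by-coefficients MFCC matrix with NumPy and reshapes it into the single-channel layout that a 2-D CNN typically expects. The sampling rate, frame length, hop size, number of mel filters, and number of coefficients are all assumed values, since the abstract does not state them, and a synthetic signal stands in for a real cry recording.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix. All defaults are assumptions."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)            # windowed frames
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2 / frame_len
    energies = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_energies = np.log(energies + 1e-10)                 # avoid log(0)
    # DCT-II over the filterbank axis, keeping the first n_coeffs coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T

# Synthetic one-second signal (placeholder for a real cry recording)
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 450.0 * t) + 0.1 * rng.standard_normal(8000)

features = mfcc(signal)                                     # (time frames, coefficients)
features = (features - features.mean()) / features.std()   # simple global normalisation
image = features[None, :, :, None]                          # (batch, height, width, channels)
```

Here each row of `features` is one time frame and each column one cepstral coefficient, so the matrix can be treated like a grayscale image whose axes are time and cepstral index. The channels-last layout of `image` is one common convention; some frameworks expect channels first.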
