Deep Learning Approach to Parkinson’s Disease Detection Using Voice Recordings and Convolutional Neural Network Dedicated to Image Classification

This study presents an approach to Parkinson’s disease detection using sustained vowel phonation and a ResNet architecture originally designed for image classification. We computed spectrograms of the audio recordings and used them as image input to a ResNet architecture pre-trained on the ImageNet and SVD databases. To prevent overfitting, the dataset was strongly augmented in the time domain. The Parkinson’s dataset (from the PC-GITA database) consists of 100 participants (50 healthy and 50 diagnosed with Parkinson’s disease). Each participant was recorded 3 times. The accuracy obtained on the validation set is above 90%, which is comparable to current state-of-the-art methods. The results are promising because they indicate that features learned on natural images transfer to artificial images representing the spectrogram of the voice signal. Moreover, we showed that Parkinson’s disease can be detected successfully using only frequency-based features. A spectrogram is a visual representation of the frequency spectrum of a signal; it makes it possible to follow how the frequency content of the signal changes over time.
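
To make the spectrogram-as-image idea concrete, the following is a minimal sketch, not the authors' pipeline. It turns a one-dimensional signal into a log-magnitude spectrogram, scales it to [0, 1], and repeats it across three channels so that it has the layout an ImageNet-style network expects. The function name `spectrogram_image`, the frame length, the hop size, the 80 dB dynamic range, and the synthetic test signal are all illustrative assumptions; the study's actual parameters are not given in this abstract.

```python
import numpy as np


def spectrogram_image(signal, frame_len=512, hop=128):
    """Return a log-magnitude spectrogram in [0, 1], shape (3, freq, time).

    frame_len and hop are assumed values chosen for illustration only.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames: (time, frame_len).
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT of each frame: (time, freq).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels and transpose to (freq, time).
    log_spec = 20.0 * np.log10(magnitude + 1e-10).T
    # Limit the dynamic range to 80 dB below the peak (an assumption).
    log_spec = np.clip(log_spec, log_spec.max() - 80.0, None)
    # Scale to [0, 1] so the array can be treated as a grayscale image.
    scaled = (log_spec - log_spec.min()) / (log_spec.max() - log_spec.min())
    # Repeat the single channel three times to mimic an RGB image.
    return np.repeat(scaled[np.newaxis, :, :], 3, axis=0)


# Synthetic stand-in for a sustained vowel: a 120 Hz tone with four overtones.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
vowel = sum(np.sin(2 * np.pi * 120.0 * k * t) / k for k in range(1, 6))

image = spectrogram_image(vowel)
print(image.shape)  # (3, 257, 122)
```

Repeating the channel is only one way to match a three-channel input; a real recording would replace the synthetic `vowel` array.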
