A Multi-Representation Ensemble Approach to Classifying Vocal Diseases

The goal of the IEEE 2018 FEMH Voice Data Challenge was to develop an effective algorithmic approach for classifying voice samples as normal or pathological, and for further subdividing the pathological samples into three types. We adopted a multi-representation ensemble approach to this task. We designed a pipeline with three classification stages, each of which used a combination of supervised, semi-supervised, and multiple-instance learners. This approach achieved a sensitivity of 89% and a specificity of 76% in distinguishing normal from pathological samples, and an unweighted average recall (UAR) of 60.67% in subclassifying pathological samples into the three types.
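For readers unfamiliar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity, and UAR are conventionally computed from true and predicted labels. It is not the authors' evaluation code: the function names, the label strings (including the placeholder subtype labels "a", "b", "c"), and the toy data are all assumptions made for illustration. Sensitivity is the recall of the pathological class, specificity is the recall of the normal class, and UAR is the plain mean of the per-class recalls, so each class counts equally regardless of how many samples it has.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def sensitivity_specificity(y_true, y_pred, positive="pathological"):
    """Sensitivity = recall of the positive class; specificity = recall of the rest.

    Assumes y_true contains at least one positive and one negative sample.
    """
    recalls = per_class_recall(
        [t == positive for t in y_true],
        [p == positive for p in y_pred],
    )
    return recalls[True], recalls[False]


def unweighted_average_recall(y_true, y_pred):
    """UAR: mean of per-class recalls, ignoring class sizes."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy data (hypothetical, not from the challenge).
y_true = ["normal", "normal", "pathological", "pathological",
          "pathological", "pathological"]
y_pred = ["normal", "pathological", "pathological", "pathological",
          "pathological", "normal"]
sens, spec = sensitivity_specificity(y_true, y_pred)

# Placeholder subtype labels standing in for the three pathological types.
sub_true = ["a", "a", "b", "b", "c", "c", "c", "c"]
sub_pred = ["a", "b", "b", "b", "c", "c", "a", "b"]
uar = unweighted_average_recall(sub_true, sub_pred)

print(f"sensitivity={sens:.2f} specificity={spec:.2f} UAR={uar:.4f}")
```

Because UAR averages recalls rather than pooling all samples, a classifier cannot score well on it simply by favoring the most frequent pathological type.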
