Novel iterative approach using generative and discriminative models for classification with missing features

Missing features are a common problem in real-world data classification, so a classification method that is robust to them is required. In this study, we propose an iterative algorithm in which a generative model and a discriminative model work together in a cycle. The Gaussian mixture model (GMM) serves as the generative part of the proposed algorithm, and the multilayer perceptron (MLP) or the support vector machine (SVM) serves as the discriminative part. We conducted two experiments using UC Irvine datasets. The first is designed to show that the proposed method achieves higher classification accuracy than previous methods for classification with missing features, including marginalization, mean imputation, conditional mean imputation, and zero-mean imputation. The second compares the classification accuracy of the proposed method with that of conventional and state-of-the-art GMM-based approaches to the missing data problem.
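For readers unfamiliar with the baselines named above, the following is a minimal sketch of mean imputation and conditional mean imputation. It is not the proposed iterative algorithm, and it assumes a single Gaussian with known mean `mu` and covariance `cov` rather than a full GMM (for a mixture, the conditional mean would be a responsibility-weighted average over components). The function names and the use of `None` to mark a missing feature are illustrative choices, not taken from the paper.

```python
def mean_impute(x, mu):
    """Replace each missing entry (None) with the corresponding feature mean."""
    return [m if v is None else v for v, m in zip(x, mu)]


def solve(a, b):
    """Solve a @ z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def conditional_mean_impute(x, mu, cov):
    """Fill missing entries with E[x_m | x_o] under a single Gaussian N(mu, cov).

    E[x_m | x_o] = mu_m + cov_mo @ inv(cov_oo) @ (x_o - mu_o)
    """
    obs = [i for i, v in enumerate(x) if v is not None]
    mis = [i for i, v in enumerate(x) if v is None]
    if not obs or not mis:
        # Nothing to condition on, or nothing to fill.
        return mean_impute(x, mu)
    cov_oo = [[cov[i][j] for j in obs] for i in obs]
    z = solve(cov_oo, [x[i] - mu[i] for i in obs])
    filled = list(x)
    for m in mis:
        filled[m] = mu[m] + sum(cov[m][o] * zo for o, zo in zip(obs, z))
    return filled


if __name__ == "__main__":
    mu = [0.0, 0.0]
    cov = [[1.0, 0.8], [0.8, 1.0]]
    x = [2.0, None]  # second feature is missing
    print(mean_impute(x, mu))
    print(conditional_mean_impute(x, mu, cov))
```

With strongly correlated features, mean imputation ignores the observed value entirely, whereas conditional mean imputation shifts the filled value toward what the observed feature implies.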
