Generalized RBF kernel for incomplete data

We construct the $\bf genRBF$ kernel, which generalizes the classical Gaussian RBF kernel to the case of incomplete data. We model the uncertainty contained in missing attributes using the data distribution and associate every point with a conditional probability density function. This allows us to embed incomplete data into a function space and to define a kernel between two points with missing values based on the scalar product in $L_2$. Experiments show that the introduced kernel, applied to an SVM classifier, gives better results than other state-of-the-art methods, especially when a large number of features are missing. Moreover, it is easy to implement and can be used with any kernel approach without additional modifications.
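The construction can be illustrated with a minimal sketch. It is not the exact formula from the paper; it rests on simplifying assumptions introduced here for illustration only: features are treated as independent (diagonal covariance), so the conditional density of a missing coordinate is just a Gaussian with that feature's mean and variance estimated from observed entries; every coordinate is additionally smoothed by a Gaussian of variance `sigma**2`; and missing values are encoded as `None`. The names `feature_stats` and `gen_rbf_sketch` are hypothetical. Under these assumptions the normalized $L_2$ scalar product of two product Gaussians has a closed form, and for complete data it reduces to the classical RBF kernel $\exp(-\|x-y\|^2 / (4\sigma^2))$.

```python
import math


def feature_stats(X):
    """Per-feature mean and variance over observed (non-None) entries.

    Assumes every feature is observed in at least one row.
    """
    means, variances = [], []
    for j in range(len(X[0])):
        obs = [row[j] for row in X if row[j] is not None]
        m = sum(obs) / len(obs)
        means.append(m)
        variances.append(sum((v - m) ** 2 for v in obs) / len(obs))
    return means, variances


def gen_rbf_sketch(x, y, means, variances, sigma=1.0):
    """Normalized L2 scalar product of two per-coordinate Gaussians.

    Observed coordinate: mean = value, variance = sigma^2.
    Missing coordinate:  mean = feature mean, variance = sigma^2 + feature variance.
    (Illustrative assumption: independent features, not the paper's full model.)
    """
    log_k = 0.0
    for xj, yj, m, s in zip(x, y, means, variances):
        a, va = (m, sigma ** 2 + s) if xj is None else (xj, sigma ** 2)
        b, vb = (m, sigma ** 2 + s) if yj is None else (yj, sigma ** 2)
        # Normalization factor <F,G> / (||F|| ||G||) for 1-D Gaussians.
        log_k += 0.5 * math.log(2.0 * math.sqrt(va * vb) / (va + vb))
        # Gaussian term: the variances of the two densities add up.
        log_k -= (a - b) ** 2 / (2.0 * (va + vb))
    return math.exp(log_k)


# Toy data with missing entries encoded as None.
X = [[1.0, 2.0], [3.0, None], [None, 0.0]]
means, variances = feature_stats(X)
K = [[gen_rbf_sketch(p, q, means, variances) for q in X] for p in X]
```

The resulting Gram matrix `K` could then be passed to any kernel method that accepts a precomputed kernel, which is consistent with the claim that no modification of the downstream learner is needed.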
