Efficient Online Model Adaptation by Incremental Simplex Tableau

Online multi-kernel learning is promising in the era of mobile computing: a combined classifier with multiple kernels is trained offline and then adapts online to personalized features, so that it can serve the end user precisely. Online adaptation is mainly carried out on end devices, which requires the adaptation algorithms to be lightweight, efficient, and accurate. Previous results focused mainly on efficiency. This paper proposes a novel online model adaptation framework that targets not only efficiency but also fast adaptation and online optimality. First, an online-optimal incremental simplex tableau (IST) algorithm is proposed, which formulates model adaptation as a linear program and produces the optimized model update at each step, whenever a personalized training sample is collected. However, maintaining online optimality at every step is expensive and may cause overfitting, especially when the online data are noisy. A Fast-IST approach is therefore proposed, which measures the deviation between the training data and the current model and schedules an update only when enough deviation is detected. The efficiency of each update is further improved by running IST for only a limited number of iterations, which bounds the computational complexity. Theoretical analysis and extensive evaluations show that Fast-IST greatly reduces computation cost while achieving fast and accurate model adaptation. It adapts faster and more accurately than the state of the art while using less computation.
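
The scheduling idea behind Fast-IST (update only when enough deviation has accumulated, and cap the work per update) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the hinge-style `deviation` measure, the `DeviationTriggeredAdapter` class, its `threshold` and `max_iters` parameters, and the perceptron-style `perceptron_step` that stands in for a simplex pivot are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A sample is (per-kernel classifier scores, label in {-1, +1}).
Sample = Tuple[Sequence[float], int]


def deviation(weights: Sequence[float], sample: Sample) -> float:
    """Hinge-style gap between the weighted kernel vote and the label.

    Assumption: the abstract does not define the deviation measure.
    """
    scores, label = sample
    margin = label * sum(w * s for w, s in zip(weights, scores))
    return max(0.0, 1.0 - margin)


def perceptron_step(weights: List[float], samples: List[Sample],
                    lr: float = 0.1) -> List[float]:
    """Hypothetical stand-in for one IST iteration (NOT a simplex pivot)."""
    new = list(weights)
    for scores, label in samples:
        if deviation(new, (scores, label)) > 0:
            new = [w + lr * label * s for w, s in zip(new, scores)]
    return new


class DeviationTriggeredAdapter:
    """Buffers samples and updates only once deviation reaches a threshold."""

    def __init__(self, weights: List[float],
                 update_step: Callable[[List[float], List[Sample]], List[float]],
                 threshold: float = 1.0, max_iters: int = 5) -> None:
        self.weights = list(weights)
        self.update_step = update_step
        self.threshold = threshold
        self.max_iters = max_iters
        self.buffer: List[Sample] = []
        self.accumulated = 0.0
        self.updates = 0

    def observe(self, sample: Sample) -> bool:
        """Record a sample; return True if a model update was triggered."""
        self.buffer.append(sample)
        self.accumulated += deviation(self.weights, sample)
        if self.accumulated < self.threshold:
            return False  # model still fits the data well enough: skip
        for _ in range(self.max_iters):  # bounded work per update
            new_weights = self.update_step(self.weights, self.buffer)
            if new_weights == self.weights:
                break  # nothing changed, stop early
            self.weights = new_weights
        # Assumption: reset the buffer and the counter after each update.
        self.buffer.clear()
        self.accumulated = 0.0
        self.updates += 1
        return True


adapter = DeviationTriggeredAdapter([0.5, 0.5], perceptron_step,
                                    threshold=1.0, max_iters=3)
well_fit = ([1.0, 1.0], 1)     # current model already agrees with this label
poorly_fit = ([1.0, -1.0], -1)  # current model has zero margin here
print(adapter.observe(well_fit), adapter.observe(poorly_fit))
```

In this sketch a well-fit sample costs only one deviation evaluation, and the `max_iters` cap bounds the cost of any single update regardless of how much data has been buffered.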
