Active Learning With Multiple Kernels

Online multiple kernel learning (OMKL) has delivered attractive performance in nonlinear function learning tasks. Its major drawback, known as the curse of dimensionality, has recently been alleviated by a random feature (RF) approximation, which makes RF-based OMKL a practical option. In this article, we introduce a new research problem, named stream-based active MKL (AMKL), in which a learner may request labels from an oracle only for data chosen by a selection criterion. This setting matters in many real-world applications, where acquiring a true label is costly or time consuming. We theoretically prove that the proposed AMKL achieves the optimal sublinear regret O(√T), as in OMKL, using only a small amount of labeled data, implying that the proposed selection criterion indeed avoids unnecessary label requests. Furthermore, we present AMKL with adaptive kernel selection (named AMKL-AKS), in which irrelevant kernels can be excluded from the kernel dictionary "on the fly." This approach improves both the efficiency of active learning and the accuracy of function learning. Via numerical tests with real data sets, we verify the superiority of AMKL-AKS, which attains accuracy similar to that of its OMKL counterpart while using fewer labeled data.
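To make the setting concrete, the following is a minimal Python sketch of a stream-based active learner over several Gaussian kernels with a random feature approximation. It is an illustration under stated assumptions, not the article's algorithm: the abstract does not specify the selection criterion, so the rule used here (request a label when the weighted spread of the per-kernel predictions exceeds a threshold, after a short warm-up) is hypothetical, as are the class name `ActiveRFMultiKernel` and all parameter names and default values. Adaptive kernel selection (AMKL-AKS) is not sketched.

```python
import math
import random


class ActiveRFMultiKernel:
    """Illustrative stream-based active learner over several Gaussian kernels."""

    def __init__(self, dim, bandwidths, num_features=50, step=0.5,
                 hedge_rate=1.0, threshold=0.01, warmup=20, seed=0):
        rng = random.Random(seed)
        self.num_features = num_features
        self.step = step              # gradient step for each kernel's model
        self.hedge_rate = hedge_rate  # learning rate of the kernel weights
        self.threshold = threshold    # hypothetical disagreement threshold
        self.warmup = warmup          # hypothetical: always query this many
        self.seen = 0
        self.num_queries = 0
        # Random frequencies per kernel, w ~ N(0, I / sigma^2) (Gaussian kernel).
        self.freqs = [[[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
                       for _ in range(num_features)] for sigma in bandwidths]
        self.theta = [[0.0] * (2 * num_features) for _ in bandwidths]
        self.weights = [1.0 / len(bandwidths)] * len(bandwidths)

    def _features(self, p, x):
        # Random Fourier features of x for kernel p (sin and cos parts).
        scale = 1.0 / math.sqrt(self.num_features)
        proj = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in self.freqs[p]]
        return ([scale * math.sin(v) for v in proj]
                + [scale * math.cos(v) for v in proj])

    def _predict_all(self, x):
        feats = [self._features(p, x) for p in range(len(self.theta))]
        preds = [sum(t * z for t, z in zip(th, zs))
                 for th, zs in zip(self.theta, feats)]
        combined = sum(w * f for w, f in zip(self.weights, preds))
        return combined, preds, feats

    def predict(self, x):
        return self._predict_all(x)[0]

    def observe(self, x, oracle):
        """Predict for x; call oracle(x) only if the selection rule fires."""
        combined, preds, feats = self._predict_all(x)
        self.seen += 1
        # Assumed criterion: weighted spread of the per-kernel predictions.
        spread = sum(w * (f - combined) ** 2
                     for w, f in zip(self.weights, preds))
        if self.seen > self.warmup and spread <= self.threshold:
            return combined, False  # kernels agree: skip the label request
        y = oracle(x)
        self.num_queries += 1
        losses = [(f - y) ** 2 for f in preds]
        best = min(losses)
        for p, (f, zs) in enumerate(zip(preds, feats)):
            # Online gradient step on 0.5 * (f - y)^2 for kernel p.
            self.theta[p] = [t - self.step * (f - y) * z
                             for t, z in zip(self.theta[p], zs)]
            # Exponential-weights update; subtracting `best` limits underflow.
            self.weights[p] *= math.exp(-self.hedge_rate * (losses[p] - best))
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
        return combined, True


if __name__ == "__main__":
    learner = ActiveRFMultiKernel(dim=1, bandwidths=[0.1, 1.0, 10.0])
    stream = random.Random(1)
    for _ in range(300):
        x = [stream.uniform(0.0, 1.0)]
        learner.observe(x, oracle=lambda v: math.sin(3.0 * v[0]))
    print(learner.num_queries, "labels requested out of 300")
```

In this sketch the oracle is called only when the learner decides to query, so `num_queries` counts the labels actually paid for. Each kernel keeps its own linear model in the random feature space, and the combination weights follow a standard exponential-weights update; both are common choices for RF-based online multi-kernel learning and are assumed here rather than taken from the article.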
