Transfer learning extensions for the probabilistic classification vector machine

Abstract Transfer learning focuses on reusing supervised learning models in a new context. Prominent applications can be found in robotics, image processing, and web mining. In these fields, the learning scenarios naturally change but often remain related to each other, motivating the reuse of existing supervised models. Most current transfer learning models are neither sparse nor interpretable. Sparsity is highly desirable when methods must run in resource-constrained environments, and interpretability is becoming increasingly important due to privacy regulations. In this work, we propose two transfer learning extensions integrated into the sparse and interpretable probabilistic classification vector machine. Compared against standard benchmarks in the field, they demonstrate their relevance through either improved sparsity or improved classification performance.
