I2VM: Incremental import vector machines

We introduce incremental import vector machines (I^2VM), an incremental learner. As a kernel-based discriminative approach, it can handle complex data distributions. The model is also sparse, which makes training and testing efficient, and it produces probabilistic output. We investigate in particular the reconstructive component of import vector machines in order to use it for robust incremental learning. Incremental update steps allow us to add and remove data samples and to update the current set of model parameters. On various standard benchmarks, we demonstrate that I^2VM is competitive with or superior to other incremental methods. We also show that our approach can handle concept drift in the data distribution.
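
To illustrate what a sparse, kernel-based model with probabilistic output looks like, the following minimal sketch computes class posteriors from a small set of import vectors using a Gaussian kernel and a softmax. It is not the I^2VM algorithm itself: the names `rbf_kernel`, `predict_proba`, `import_vectors`, `alpha`, and `gamma`, as well as the toy values, are assumptions made for illustration, and the re-estimation of parameters after adding or removing a sample is not shown.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def predict_proba(x, import_vectors, alpha, gamma=1.0):
    """Class posteriors of a sparse kernel logistic regression model.

    import_vectors: the small subset of samples the model keeps.
    alpha[i][c]:    weight of import vector i for class c (hypothetical layout).
    """
    k = [rbf_kernel(x, v, gamma) for v in import_vectors]
    n_classes = len(alpha[0])
    scores = [
        sum(k[i] * alpha[i][c] for i in range(len(k)))
        for c in range(n_classes)
    ]
    # Numerically stable softmax over the class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy two-class model with two import vectors (illustrative values only).
import_vectors = [[0.0, 0.0], [1.0, 1.0]]
alpha = [[2.0, -2.0], [-2.0, 2.0]]

probs = predict_proba([0.1, 0.0], import_vectors, alpha)

# Dropping an import vector still yields a valid distribution; a real
# incremental learner would also re-optimise the remaining weights.
probs_reduced = predict_proba([0.1, 0.0], import_vectors[:1], alpha[:1])
```

Because prediction only touches the retained import vectors, its cost grows with the size of that subset rather than with the number of training samples seen so far.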
