Online Learning from Capricious Data Streams: A Generative Approach

Learning from streaming data has received extensive attention in recent years. Existing approaches assume that the feature space is fixed or changes according to explicit regularities, which limits their applicability in dynamic environments where data streams are described by an arbitrarily varying feature space. To handle such capricious data streams, this paper develops a novel algorithm, OCDS (Online learning from Capricious Data Streams), which makes no assumption about feature space dynamics. OCDS trains a learner on a universal feature space that establishes relationships between old and new features, so that patterns learned in the old feature space can be used in the new one. Specifically, the universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model of this construction process and show, through theoretical analysis, that learning from the universal feature space can effectively improve performance. Experimental results show that OCDS performs well on both synthetic and real datasets.
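To make the idea of a universal feature space concrete, the following is a minimal sketch and not the OCDS algorithm itself. It assumes a linear model over the union of all features seen so far, and it replaces the paper's generative graphical model with a simple stand-in: pairwise least-squares coefficients, estimated from co-observed features, are used to fill in features that are missing from the current instance. The class name `UniversalSpaceLearner`, its methods, and the hinge-loss subgradient update are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class UniversalSpaceLearner:
    """Toy linear learner over the union of all features seen so far.

    Illustrative assumption only: relatedness is a pairwise least-squares
    coefficient, not the generative graphical model described in the paper.
    """

    def __init__(self, lr=0.1):
        self.lr = lr
        self.w = defaultdict(float)    # one weight per universal feature
        self.sxy = defaultdict(float)  # running sum of x_i * x_j per pair (i, j)
        self.sxx = defaultdict(float)  # running sum of x_i ** 2 per pair (i, j)

    def _update_relatedness(self, x):
        # Pairwise statistics are updated only for co-observed features.
        for i, xi in x.items():
            for j, xj in x.items():
                if i != j:
                    self.sxy[(i, j)] += xi * xj
                    self.sxx[(i, j)] += xi * xi

    def reconstruct(self, x):
        """Estimate known-but-missing features from the observed ones."""
        full = dict(x)
        for j in list(self.w):
            if j in x:
                continue
            estimates = [
                self.sxy[(i, j)] / self.sxx[(i, j)] * xi
                for i, xi in x.items()
                if self.sxx.get((i, j), 0.0) > 0.0
            ]
            if estimates:
                full[j] = sum(estimates) / len(estimates)
        return full

    def score(self, x):
        """Linear score computed on the reconstructed universal vector."""
        full = self.reconstruct(x)
        return sum(self.w[i] * v for i, v in full.items())

    def learn(self, x, y):
        """One online step; x maps feature name to value, y is -1 or +1."""
        for i in x:
            self.w.setdefault(i, 0.0)  # register newly appearing features
        self._update_relatedness(x)
        full = self.reconstruct(x)
        margin = y * sum(self.w[i] * v for i, v in full.items())
        if margin < 1.0:  # hinge loss is active: take a subgradient step
            for i, v in full.items():
                self.w[i] += self.lr * y * v


# Hypothetical stream: feature "b" is observed early on and later vanishes.
learner = UniversalSpaceLearner()
for a in [1.0, -1.0, 2.0, -2.0]:
    learner.learn({"a": a, "b": 2 * a}, 1 if a > 0 else -1)
later = learner.reconstruct({"a": 3.0})  # "b" is estimated from "a"
```

In this sketch the weight learned for `b` still contributes to predictions after `b` stops arriving, because its value is estimated from `a`. That mirrors, in a very simplified form, the claim that patterns learned in the old feature space remain usable in the new one.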
