Toward Mining Capricious Data Streams: A Generative Approach

Learning with streaming data has received extensive attention in recent years. Existing approaches assume that the feature space is either fixed or changes according to explicit regularities, which limits their applicability in real-time applications. For example, in a smart healthcare platform, the feature space of patient data varies when different medical service providers use nonidentical feature sets to describe patients' symptoms. To fill this gap, in this article we propose a novel learning paradigm, namely, Generative Learning With Streaming Capricious (GLSC) data, which makes no assumption about the feature-space dynamics. In other words, GLSC handles data streams with a varying feature space, where each arriving instance can arbitrarily carry new features and/or stop carrying some of the old features. Specifically, GLSC trains a learner on a universal feature space that establishes relationships between old and new features, so that the patterns learned in the old feature space can be reused in the new feature space. The universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model to capture the construction process and show that learning from the universal feature space improves performance with theoretical guarantees. Experimental results demonstrate that GLSC achieves strong performance on both synthetic and real data sets.
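To make the capricious-stream setting concrete, the following is a minimal, self-contained NumPy sketch, not the authors' GLSC model: the learner maintains a growing universal feature space, imputes the features an arriving instance does not carry from those it does carry via a simple running relatedness (co-occurrence) matrix, and then performs an online logistic-loss update. All names and hyperparameters (e.g., CapriciousStreamLearner, eta, lam) are hypothetical and stand in for the paper's generative construction.

```python
# Illustrative sketch only: linear imputation over a universal feature space,
# not the generative graphical model proposed in the paper.
import numpy as np


class CapriciousStreamLearner:
    def __init__(self, eta=0.1, lam=1e-2):
        self.features = []          # universal feature space (ordered names)
        self.index = {}             # feature name -> column position
        self.w = np.zeros(0)        # linear model over the universal space
        self.C = np.zeros((0, 0))   # running co-occurrence (feature relatedness)
        self.eta, self.lam = eta, lam

    def _grow(self, names):
        """Extend the universal feature space with newly observed features."""
        for name in names:
            if name not in self.index:
                self.index[name] = len(self.features)
                self.features.append(name)
        d = len(self.features)
        w = np.zeros(d); w[:self.w.size] = self.w; self.w = w
        C = np.zeros((d, d)); C[:self.C.shape[0], :self.C.shape[1]] = self.C
        self.C = C

    def _complete(self, x_obs):
        """Impute unobserved features from observed ones (ridge-style sketch)."""
        d = len(self.features)
        obs = [self.index[k] for k in x_obs]
        obs_set = set(obs)
        mis = [j for j in range(d) if j not in obs_set]
        x = np.zeros(d)
        x[obs] = [x_obs[self.features[j]] for j in obs]
        if mis and obs:
            A = self.C[np.ix_(obs, obs)] + self.lam * np.eye(len(obs))
            B = self.C[np.ix_(mis, obs)]
            x[mis] = B @ np.linalg.solve(A, x[obs])
        return x, obs

    def partial_fit(self, x_obs, y):
        """x_obs: dict {feature name: value}; y: label in {-1, +1}."""
        self._grow(x_obs.keys())
        x, obs = self._complete(x_obs)
        y_hat = 1.0 if self.w @ x >= 0 else -1.0              # prequential prediction
        self.C[np.ix_(obs, obs)] += np.outer(x[obs], x[obs])  # update relatedness
        grad = -y * x / (1.0 + np.exp(y * (self.w @ x)))      # logistic-loss gradient
        self.w = self.w - self.eta * grad
        return y_hat
```

A stream can then be consumed one instance at a time, e.g., learner.partial_fit({'heart_rate': 0.8, 'temperature': 0.3}, +1), where each instance is free to introduce new feature names or to omit previously seen ones.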
