Storage Fit Learning with Feature Evolvable Streams

Feature evolvable learning, in which old features vanish and new features emerge as a stream is processed, has been widely studied in recent years. Conventional methods usually assume that a label is revealed after each prediction. In practice, however, this assumption may not hold: at most time steps no label is given. To tackle this problem, we leverage manifold regularization, exploiting previously seen similar data to refine the online model. This approach alone would require storing all previous data, which is infeasible for streams that arrive sequentially in large volume, so we maintain a buffer that stores only part of them. Since different devices have different storage budgets, the learning approach should adapt to the available budget. In this paper, we propose a new setting, Storage-Fit Feature-Evolvable streaming Learning (SF2EL), which incorporates rarely provided labels into feature evolution. Our framework fits its behavior to different storage budgets when learning from feature evolvable streams with unlabeled data. Both theoretical and empirical results validate that our approach preserves the merit of the original feature evolvable learning: it can always track the best baseline and thus performs well at every time step.
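To make the two ingredients the abstract names concrete, below is a minimal Python sketch (ours, not the paper's code) of a storage-budgeted buffer filled by reservoir sampling, paired with an online linear learner that applies a graph-based manifold-regularization step on unlabeled rounds. All names (ReservoirBuffer, BufferedManifoldLearner) and hyperparameters are hypothetical, and the sketch assumes a fixed feature space, omitting the feature-evolution machinery of the full method.

import numpy as np

class ReservoirBuffer:
    """Fixed-capacity buffer filled by reservoir sampling.

    The capacity is set from the device's storage budget, so memory
    use adapts to the budget while every stream item has an equal
    chance of being retained."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = np.random.default_rng(seed)

    def add(self, x):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(x)
        else:
            j = self.rng.integers(self.seen)  # uniform in [0, seen)
            if j < self.capacity:
                self.items[j] = x

class BufferedManifoldLearner:
    """Online linear model with manifold regularization over a buffer.

    On the (rare) labeled rounds it takes a gradient step on the
    squared loss; on every round it pulls its prediction toward the
    predictions of similar buffered instances."""
    def __init__(self, dim, budget, lr=0.01, lam=0.1, sigma=1.0):
        self.w = np.zeros(dim)
        self.buffer = ReservoirBuffer(budget)
        self.lr, self.lam, self.sigma = lr, lam, sigma

    def predict(self, x):
        return self.w @ x

    def update(self, x, y=None):
        grad = np.zeros_like(self.w)
        if y is not None:
            # labeled round: gradient of 0.5 * (w.x - y)^2
            grad += (self.predict(x) - y) * x
        # manifold term: similar points should receive similar scores;
        # gradient of 0.5 * lam * sim * (w.x - w.xb)^2 for each buffered xb
        for xb in self.buffer.items:
            sim = np.exp(-np.sum((x - xb) ** 2) / (2 * self.sigma ** 2))
            grad += self.lam * sim * (self.predict(x) - self.predict(xb)) * (x - xb)
        self.w -= self.lr * grad
        self.buffer.add(x)

Shrinking the budget argument degrades the manifold term gracefully rather than breaking the learner, which is the "storage-fit" behavior the abstract describes: the same update rule runs under any capacity.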
