Feature ranking in Hoeffding algorithms for regression

Feature selection and feature ranking are two aspects of the same learning task. Both are well studied in batch scenarios, but not in the streaming setting. This paper studies feature ranking from data streams for online regression models. The main challenge is that the relevance of features may change over time: features that were relevant in the past may be irrelevant now, and vice versa. We propose three new online feature ranking algorithms designed for Hoeffding algorithms. We have implemented the three methods in AMRules, a streaming regression algorithm that learns model rules. We compare their behaviour experimentally and present the pros and cons of each method.
