Label free change detection on streaming data with cooperative multi-objective genetic programming

Classification under streaming data conditions requires that the machine learning (ML) approach operate interactively with the stream content. Given some initial ML classification capability, it cannot be assumed that the stream content will remain stationary. It is therefore necessary to first detect when the stream content changes; only after a change is detected can classifier retraining be triggered. Current methods for change detection tend to assume an entropy filter approach, which requires class labels. In practice, labeling the stream would be extremely expensive. This work proposes an approach in which the behaviour of genetic programming (GP) individuals is used to detect change without the use of labels. Label information is requested only after a change is detected. Benchmarking under a computer network traffic analysis scenario demonstrates that the proposed approach performs at least as well as the filter method, while retaining the advantage of requiring no labels to detect change.
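The general idea of detecting change from model behaviour alone can be illustrated with a minimal sketch. It is not the method of this work: it assumes, purely for illustration, that "behaviour" is summarised as the distribution of a classifier's predicted outputs over a window, and that change is flagged when the total variation distance between a reference window and the most recent window exceeds a threshold. The class name `LabelFreeChangeDetector` and the parameters `window` and `threshold` are hypothetical.

```python
from collections import Counter, deque


def output_histogram(outputs, classes):
    """Relative frequency of each predicted class in a window of outputs."""
    counts = Counter(outputs)
    total = len(outputs)
    return [counts[c] / total for c in classes]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


class LabelFreeChangeDetector:
    """Illustrative detector (an assumption, not the paper's algorithm).

    It watches only the model's predictions, never the true labels.
    """

    def __init__(self, classes, window=50, threshold=0.3):
        self.classes = classes
        self.window = window
        self.threshold = threshold
        self.reference = None               # histogram of the baseline window
        self.recent = deque(maxlen=window)  # sliding window of predictions

    def update(self, predicted):
        """Feed one prediction; return True when a change is flagged."""
        self.recent.append(predicted)
        if len(self.recent) < self.window:
            return False
        current = output_histogram(self.recent, self.classes)
        if self.reference is None:
            # First full window becomes the baseline behaviour.
            self.reference = current
            self.recent.clear()
            return False
        if total_variation(self.reference, current) > self.threshold:
            # Change flagged: this is the point at which labels would be
            # requested and retraining triggered; the baseline is rebuilt.
            self.reference = None
            self.recent.clear()
            return True
        return False
```

In such a sketch, a caller would request labels and retrain only when `update` returns `True`, so the labeling cost is incurred after a suspected change rather than continuously.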
