Creating ensemble classifiers through order and incremental data selection in a stream

This paper presents an original time-sensitive traffic management application for road safety diagnosis at signalized intersections. Such applications require dealing with data streams that may be subject to concept drift over various time scales. The road safety analysis method relies on the estimation of severity indicators for vehicle interactions, based on complex and noisy spatial occupancy information. An expert provides imprecise labels from video recordings of the traffic scenes. To improve the overall and per-class performance, as well as the stability, of learning in a stream, this paper presents new ensemble methods built from incremental algorithms by exploiting their sensitivity to the order in which instances are processed. Several data selection criteria, many of them used in active learning, are studied in a comprehensive experimental evaluation that includes benchmark datasets from the UCI machine learning repository and the prediction of severity indicators. The best performance is obtained with a criterion that selects the instances misclassified by the current hypothesis. The proposed ensemble methods using this criterion share similar principles with AdaBoost and achieve comparable performance, while having a smaller computational training cost.

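The following is a minimal sketch (not the authors' code) of the idea described in the abstract: each incremental base learner processes the stream in a different random order and is updated only on instances misclassified by its current hypothesis, and predictions are combined by majority vote. It assumes integer class labels and scikit-learn's GaussianNB as the incremental base learner; the class and parameter names are illustrative.

```python
# Sketch of an order-and-selection ensemble over a data stream.
# Assumptions: integer class labels, GaussianNB as incremental base learner.
import numpy as np
from sklearn.naive_bayes import GaussianNB  # any learner with partial_fit would do


class OrderSelectionEnsemble:
    def __init__(self, n_members=10, seed=0):
        self.rng = np.random.default_rng(seed)
        self.members = [GaussianNB() for _ in range(n_members)]

    def fit_stream(self, X, y, classes):
        X, y = np.asarray(X), np.asarray(y)
        for clf in self.members:
            # Each member sees the stream in its own random processing order.
            order = self.rng.permutation(len(X))
            for i in order:
                xi, yi = X[i:i + 1], y[i:i + 1]
                # Selection criterion: update only on instances misclassified
                # by the current hypothesis (the first instance always trains).
                if not hasattr(clf, "classes_") or clf.predict(xi)[0] != yi[0]:
                    clf.partial_fit(xi, yi, classes=classes)
        return self

    def predict(self, X):
        # Majority vote over the ensemble members (integer labels assumed).
        votes = np.stack([clf.predict(X) for clf in self.members]).astype(int)
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

Because base learners are updated incrementally and only on selected instances, training touches each instance at most once per member, which is where the cost advantage over boosting-style reweighting comes from in this sketch.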