Using Principal Component Analysis to Solve a Class Imbalance Problem in Traffic Incident Detection

Traffic incidents are rare but important events, so a detection system that must identify them faces highly imbalanced data in real-world situations. Traffic incident detection can therefore be treated as the task of learning classifiers from imbalanced, or skewed, datasets. Using principal component analysis (PCA), a one-class classifier for incident detection is constructed from the major and minor principal components of normal instances. Experiments are conducted on a real traffic dataset collected from the A12 highway in the Netherlands. The parameter settings, including the significance level, the percentage of total variation explained, and the upper bound on the eigenvalues of the minor components, are discussed. The test results show that this method outperforms partial least squares regression, which makes it promising for traffic incident detection.
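To make the idea of a one-class classifier built from major and minor principal components more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a common formulation of the principal component classifier: features are standardized, PCA is run on the correlation matrix of normal instances only, and a sample is flagged when its eigenvalue-weighted score on either the major or the minor components exceeds an empirical quantile of the scores seen on normal data. The class name `PCAOneClassDetector`, the default parameter values (0.5, 0.2, 0.02), and the three synthetic features are all illustrative assumptions, not values or data from the paper.

```python
import numpy as np


class PCAOneClassDetector:
    """Illustrative one-class detector trained on normal instances only."""

    def __init__(self, explained=0.5, minor_eig_max=0.2, alpha=0.02):
        self.explained = explained          # share of variation for major components (assumed)
        self.minor_eig_max = minor_eig_max  # eigenvalue upper bound for minor components (assumed)
        self.alpha = alpha                  # significance level per score (assumed)

    def fit(self, X_normal):
        X = np.asarray(X_normal, dtype=float)
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0, ddof=1)

        # Eigen-decomposition of the correlation matrix, sorted by descending eigenvalue.
        eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
        order = np.argsort(eigvals)[::-1]
        self.eigvals_ = np.maximum(eigvals[order], 1e-12)  # guard against division by zero
        self.eigvecs_ = eigvecs[:, order]

        # Major components: the leading ones that reach the requested share of variation.
        cum = np.cumsum(self.eigvals_) / self.eigvals_.sum()
        q = min(int(np.searchsorted(cum, self.explained)) + 1, len(cum))
        self.major_ = np.arange(q)
        # Minor components: those whose eigenvalue falls below the upper bound.
        self.minor_ = np.where(self.eigvals_ < self.minor_eig_max)[0]

        # Thresholds are empirical (1 - alpha) quantiles of the scores on normal data.
        major, minor = self._scores(X)
        self.c_major_ = np.quantile(major, 1 - self.alpha)
        self.c_minor_ = np.quantile(minor, 1 - self.alpha)
        return self

    def _scores(self, X):
        Z = (np.asarray(X, dtype=float) - self.mean_) / self.std_
        Y = Z @ self.eigvecs_            # principal component scores
        W = Y ** 2 / self.eigvals_       # squared scores weighted by 1 / eigenvalue
        return W[:, self.major_].sum(axis=1), W[:, self.minor_].sum(axis=1)

    def predict(self, X):
        """Return True where a sample is flagged as a possible incident."""
        major, minor = self._scores(X)
        return (major > self.c_major_) | (minor > self.c_minor_)


# Synthetic stand-in for normal traffic: three strongly correlated, made-up features.
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
X_normal = np.hstack([t, t, -t]) + 0.1 * rng.normal(size=(500, 3))

detector = PCAOneClassDetector().fit(X_normal)
# The second sample breaks the correlation structure learned from normal data.
print(detector.predict(np.array([[1.0, 1.0, -1.0], [2.0, 2.0, 2.0]])))
```

In this sketch the major-component score reacts to samples that are extreme along the dominant directions of normal traffic, while the minor-component score reacts to samples that violate the usual correlation between features. The three constructor arguments correspond to the parameters the abstract lists: the percentage of total variation explained, the upper bound on the eigenvalues of the minor components, and the significance level.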
