Anomaly Detection in Transportation Corridors using Manifold Embedding

The formation of secure transportation corridors, through which cargo and shipments from points of entry can be dispatched safely to highly sensitive and secure locations, is a high national priority. One of the key tasks of the program is the detection of anomalous cargo based on sensor readings at truck weigh stations. Due to the high variability, dimensionality, and/or noise content of sensor data in transportation corridors, appropriate feature representation is crucial to the success of anomaly detection methods in this domain. In this paper, we empirically investigate the usefulness of manifold embedding methods for feature representation in anomaly detection problems in the domain of transportation corridors. We consider both linear methods, such as multi-dimensional scaling (MDS), and nonlinear methods, such as locally linear embedding (LLE) and isometric feature mapping (ISOMAP). Our study indicates that such embedding methods provide a natural mechanism for keeping anomalous points away from the dense (normal) regions in the embedding of the data. We illustrate the efficacy of manifold embedding methods for anomaly detection through experiments on simulated data as well as real truck data from weigh stations.
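
The general idea of embedding data and then looking for points that lie far from the dense region can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes classical MDS as the embedding, a mean k-nearest-neighbor distance as the anomaly score, and a small synthetic data set standing in for weigh-station sensor readings. The names `classical_mds` and `knn_score`, the value of `k`, and the data are all hypothetical choices made for illustration.

```python
import math
import random


def classical_mds(points, dim=2, iters=300):
    """Embed points in `dim` dimensions with classical MDS.

    The top eigenpairs of the double-centered matrix are found by
    power iteration with deflation (an illustrative choice).
    """
    n = len(points)
    # Squared Euclidean distances between all pairs of points.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
          for p in points]
    row = [sum(r) / n for r in d2]
    grand = sum(row) / n
    # Double centering: B = -1/2 * J * D2 * J.
    b = [[-0.5 * (d2[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    rng = random.Random(0)
    coords = [[0.0] * dim for _ in range(n)]
    for axis in range(dim):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for unit v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(x * y for x, y in zip(v, bv))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


def knn_score(coords, k=5):
    """Score each point by its mean distance to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(coords):
        dists = sorted(math.dist(p, q) for j, q in enumerate(coords) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Synthetic stand-in data: 40 "normal" readings plus one distant point.
rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(40)]
data.append([8.0, 8.0, 8.0])  # hypothetical anomalous reading (index 40)

embedding = classical_mds(data, dim=2)
scores = knn_score(embedding, k=5)
flagged = max(range(len(scores)), key=scores.__getitem__)
print("highest-scoring point:", flagged)
```

In this toy setting the distant point remains isolated in the two-dimensional embedding, so a simple neighbor-distance score separates it from the dense cluster. A nonlinear method such as LLE or ISOMAP could be substituted for the embedding step, with the scoring step unchanged.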
