Detecting Flow Anomalies in Distributed Systems

Anomalies deep within the networks of distributed systems often degrade their efficiency and performance. These anomalies are difficult to detect because a distributed system may not have enough sensors to monitor the flow of traffic between the interconnected nodes of its network. Without early detection and correction, they may worsen over time and eventually cause disastrous outcomes at a point that cannot be predicted. Using only coarse-grained information from the two end points of network flows, we propose a network transmission model and a localization algorithm that detect the locations of anomalies within distributed systems and rank them using a proposed metric. We evaluate our approach on passenger records from the public transportation system of an urbanized city and correlate our findings with passengers' postings on social media microblogs. Our experiments show that the metric derived using our localization algorithm ranks anomalies better than standard deviation measures from statistical models. Our case studies also demonstrate that transportation events reported in social media microblogs match the locations of our detected anomalies, suggesting that our algorithm performs well in locating anomalies within distributed systems.
