Temporal Multi-View Inconsistency Detection for Network Traffic Analysis

In this paper, we investigate the problem of identifying inconsistent hosts in large-scale enterprise networks by mining multiple views of temporal data collected from those networks. The time-varying behavior of a host is typically consistent across views, so hosts that behave inconsistently are candidate anomalies that warrant further investigation. To find them, we develop an approach that extracts the common patterns hidden in multiple views and detects inconsistency by measuring the deviation from these patterns. Specifically, we first apply various anomaly detectors to the raw data and form a three-way tensor (host, time, detector) for each view. We then develop a joint probabilistic tensor factorization method to derive the latent tensor subspace, which captures the time-varying behavior common to all views. Based on the extracted subspace, we calculate an inconsistency score for each host that measures its deviation from the common behavior. We demonstrate the effectiveness of the proposed approach on two enterprise-wide network-based anomaly detection tasks. An enterprise network consists of multiple hosts (servers, desktops, laptops), and each host sends and receives a time-varying number of bytes across network protocols (e.g., TCP, UDP, ICMP) or sends URL requests to DNS under various categories. Inconsistent behavior of a host is often a leading indicator of potential issues (e.g., instability, malicious behavior, or hardware malfunction). We perform experiments on real-world data collected from IBM enterprise networks and demonstrate that the proposed method finds hosts with inconsistent behavior that are important to cybersecurity applications.
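The pipeline described above (per-view tensors, a shared subspace, a per-host deviation score) can be illustrated with a minimal sketch. The abstract does not specify the joint probabilistic tensor factorization, so this sketch substitutes a much simpler stand-in: a truncated SVD of the concatenated host-mode unfoldings. The function name `inconsistency_scores`, the `rank` parameter, and the synthetic data are all assumptions made for illustration and are not the authors' method or data.

```python
import numpy as np

def inconsistency_scores(views, rank=2):
    """Score each host by its deviation from a subspace shared across views.

    views: list of arrays shaped (hosts, time, detectors), one per view.
    Stand-in technique (an assumption): truncated SVD instead of the
    paper's joint probabilistic tensor factorization.
    """
    blocks = []
    for v in views:
        x = v.reshape(v.shape[0], -1)         # host-mode unfolding
        blocks.append(x / np.linalg.norm(x))  # give each view equal weight
    stacked = np.hstack(blocks)               # hosts x (all view features)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    basis = vt[:rank]                         # common cross-view patterns
    residual = stacked - stacked @ basis.T @ basis
    return np.linalg.norm(residual, axis=1)   # one score per host

# Synthetic example (hypothetical data): 20 hosts, 30 time steps,
# 3 detectors, 2 views. Hosts 0-9 follow pattern a in both views and
# hosts 10-19 follow pattern b, except host 5, which follows a in
# view 1 but b in view 2.
rng = np.random.default_rng(0)
n_hosts, n_time = 20, 30
t = np.arange(n_time)
a = np.sin(2 * np.pi * t / n_time)
b = np.cos(2 * np.pi * t / n_time)
detector_weights = [np.array([1.0, 0.5, 0.2]), np.array([0.3, 1.0, 0.7])]

views = []
for view_idx, w in enumerate(detector_weights):
    tensor = np.empty((n_hosts, n_time, w.size))
    for host in range(n_hosts):
        pattern = a if host < 10 else b
        if host == 5 and view_idx == 1:
            pattern = b                       # inject the inconsistency
        tensor[host] = np.outer(pattern, w)
    tensor += 0.05 * rng.standard_normal(tensor.shape)
    views.append(tensor)

scores = inconsistency_scores(views, rank=2)
most_inconsistent = int(np.argmax(scores))
print("most inconsistent host:", most_inconsistent)
```

With `rank=2` the shared basis can represent only the two cross-view patterns that most hosts follow, so a host whose views disagree is left with a large residual. In this toy setting the rank is chosen to match the number of planted patterns; on real data it would need to be selected some other way.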
