Detecting a Variety of Long-Term Stealthy User Behaviors on High Speed Links

Monitoring user behaviors over high speed links is important for applications such as network anomaly detection. Previous work focuses on short-lived anomalies, such as extremely frequent users within a single short timeslot (e.g., 1 minute). Little attention has been paid to detecting users with stealthy behaviors (e.g., persistent, co-occurrence, anti-co-occurrence, and periodic behaviors) over a long period of time at timeslot granularity. Because routers have limited computation and storage resources, collecting massive network traffic over a long period of time is prohibitively expensive. We develop an end-to-end method that addresses the challenges of both long-term online traffic collection and offline user behavior analysis. We conduct extensive experiments on a variety of real-world traffic to evaluate its performance in detecting persistent, co-occurrence, anti-co-occurrence, and periodic behaviors, and the results demonstrate that our method significantly outperforms state-of-the-art methods.
