Traffic Aggregation for Malware Detection

Stealthy malware, such as botnets and spyware, is hard to detect because its activities are subtle and do not disrupt the network, in contrast to DoS attacks and aggressive worms. Stealthy malware does, however, communicate: to exfiltrate data to the attacker, to receive the attacker's commands, or to carry out those commands. Moreover, since malware rarely infiltrates only a single host in a large enterprise, these communications should emerge from multiple hosts within coarse temporal proximity to one another. In this paper, we describe a system called TĀMD (pronounced "tamed") with which an enterprise can identify candidate groups of infected computers within its network. TĀMD accomplishes this by finding new communication "aggregates" involving multiple internal hosts, i.e., communication flows that share common characteristics. We describe characteristics for defining aggregates (including flows that communicate with the same external network, that share similar payload, and/or that involve internal hosts with similar software platforms) and justify their use in finding infected hosts. We also detail efficient algorithms employed by TĀMD for identifying such aggregates, and demonstrate a particular configuration of TĀMD that identifies new infections for multiple bot and spyware examples, within traces of traffic recorded at the edge of a university network. This is achieved even when the number of infected hosts comprises only about 0.0097% of all internal hosts in the network.
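
To make the idea of a "new" aggregate concrete, the following is a minimal sketch of one of the characteristics named above: several internal hosts contacting the same external network. It is an illustration only, not the algorithm used by the system. The function names, the /24 prefix length, the `known_prefixes` baseline of previously seen destinations, and the `min_hosts` threshold are all assumptions introduced for this example.

```python
from collections import defaultdict
from ipaddress import ip_network


def external_prefix(ip: str, prefix_len: int = 24) -> str:
    """Map an external IPv4 address to its enclosing prefix (assumed /24)."""
    return str(ip_network(f"{ip}/{prefix_len}", strict=False))


def new_destination_aggregates(flows, known_prefixes, min_hosts=2):
    """Group internal hosts by the external prefix they contact.

    flows: iterable of (internal_host, external_ip) pairs from one time window.
    known_prefixes: prefixes already seen in past traffic (hypothetical baseline).
    Returns prefixes not in the baseline that at least `min_hosts` distinct
    internal hosts contacted, mapped to the set of those hosts.
    """
    hosts_by_prefix = defaultdict(set)
    for internal_host, external_ip in flows:
        prefix = external_prefix(external_ip)
        if prefix not in known_prefixes:
            hosts_by_prefix[prefix].add(internal_host)
    return {p: h for p, h in hosts_by_prefix.items() if len(h) >= min_hosts}


# Made-up flow records using documentation address ranges.
flows = [
    ("10.0.0.5", "203.0.113.7"),
    ("10.0.0.9", "203.0.113.200"),
    ("10.0.0.5", "198.51.100.1"),
    ("10.0.0.7", "192.0.2.10"),
    ("10.0.0.8", "192.0.2.11"),
]
known = {"192.0.2.0/24"}
candidates = new_destination_aggregates(flows, known)
```

In this toy input, only the 203.0.113.0/24 prefix is both absent from the baseline and contacted by more than one internal host, so it is the only candidate returned. A fuller treatment would combine such a destination test with the payload and platform characteristics described above.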
