Beehive: large-scale log analysis for detecting suspicious activity in enterprise networks

As more and more Internet-based attacks arise, organizations are responding by deploying an assortment of security products that generate situational intelligence in the form of logs. These logs often contain high volumes of interesting and useful information about activities in the network, and are among the first data sources that information security specialists consult when they suspect that an attack has taken place. However, security products often come from a patchwork of vendors, and are inconsistently installed and administered. They generate logs whose formats differ widely and that are often incomplete, mutually contradictory, and very large in volume. Hence, although this collected information is useful, it is often dirty. We present a novel system, Beehive, that attacks the problem of automatically mining and extracting knowledge from the dirty log data produced by a wide variety of security products in a large enterprise. We improve on signature-based approaches to detecting security incidents and instead identify suspicious host behaviors that Beehive reports as potential security incidents. These incidents can then be further analyzed by incident response teams to determine whether a policy violation or attack has occurred. We have evaluated Beehive on the log data collected in a large enterprise, EMC, over a period of two weeks. We compare the incidents identified by Beehive against enterprise Security Operations Center reports, antivirus software alerts, and feedback from enterprise security specialists. We show that Beehive is able to identify malicious events and policy violations which would otherwise go undetected.

[1]  Michael K. Reiter,et al.  Traffic Aggregation for Malware Detection , 2008, DIMVA.

[2]  Leyla Bilge,et al.  EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis , 2011, NDSS.

[3]  I. Jolliffe Principal Component Analysis , 2002 .

[4]  Nitesh V. Chawla,et al.  Authentication anomaly detection: a case study on a virtual private network , 2007, MineNet '07.

[5]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.

[6]  Henry L. Owen,et al.  The use of Honeynets to detect exploited systems across large enterprise networks , 2003, IEEE Systems, Man and Cybernetics SocietyInformation Assurance Workshop, 2003..

[7]  Thorsten Holz,et al.  As the net churns: Fast-flux botnet observations , 2008, 2008 3rd International Conference on Malicious and Unwanted Software (MALWARE).

[8]  Nick Feamster,et al.  Building a Dynamic Reputation System for DNS , 2010, USENIX Security Symposium.

[9]  Arvind Krishnamurthy,et al.  Studying Spamming Botnets Using Botlab , 2009, NSDI.

[10]  Kavé Salamatian,et al.  Anomaly extraction in backbone networks using association rules , 2009, IMC '09.

[11]  Sandeep Yadav,et al.  Detecting algorithmically generated malicious domain names , 2010, IMC '10.

[12]  Andreas Terzis,et al.  A multifaceted approach to understanding the botnet phenomenon , 2006, IMC '06.

[13]  Farnam Jahanian,et al.  The Zombie Roundup: Understanding, Detecting, and Disrupting Botnets , 2005, SRUTI.

[14]  Felix C. Freiling,et al.  Measuring and Detecting Fast-Flux Service Networks , 2008, NDSS.

[15]  Felix C. Freiling,et al.  Botnet Tracking: Exploring a Root-Cause Methodology to Prevent Distributed Denial-of-Service Attacks , 2005, ESORICS.

[16]  Bernhard Plattner,et al.  Entropy based worm and anomaly detection in fast IP networks , 2005, 14th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprise (WETICE'05).

[17]  Brian Rexroad,et al.  Wide-Scale Botnet Detection and Characterization , 2007, HotBots.

[18]  Ali S. Hadi,et al.  Finding Groups in Data: An Introduction to Chster Analysis , 1991 .

[19]  William H. Sanders,et al.  Safeguarding academic accounts and resources with the University Credential Abuse Auditing System , 2012, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012).

[20]  Christopher Krügel,et al.  Your botnet is my botnet: analysis of a botnet takeover , 2009, CCS.

[21]  Charu C. Aggarwal,et al.  An Introduction to Cluster Analysis , 2018, Data Clustering: Algorithms and Applications.

[22]  W. Timothy Strayer,et al.  Using Machine Learning Techniques to Identify Botnet Traffic , 2006 .

[23]  Vinod Yegneswaran,et al.  BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation , 2007, USENIX Security Symposium.

[24]  Nick Feamster,et al.  Understanding the network-level behavior of spammers , 2006, SIGCOMM.

[25]  José Carlos Brustoloni,et al.  Bayesian bot detection based on DNS traffic similarity , 2009, SAC '09.

[26]  Radu State,et al.  BotTrack: Tracking Botnets Using NetFlow and PageRank , 2011, Networking.

[27]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[28]  Wenke Lee,et al.  Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces , 2009, 2009 Annual Computer Security Applications Conference.

[29]  Guofei Gu,et al.  BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic , 2008, NDSS.

[30]  Leyla Bilge,et al.  Disclosure: detecting botnet command and control servers through large-scale NetFlow analysis , 2012, ACSAC '12.

[31]  Leyla Bilge,et al.  Exposure: A Passive DNS Analysis Service to Detect and Report Malicious Domains , 2014, TSEC.

[32]  Aiko Pras,et al.  Anomaly Characterization in Flow-Based Traffic Time Series , 2008, IPOM.

[33]  Heejo Lee,et al.  Botnet Detection by Monitoring Group Activities in DNS Traffic , 2007, 7th IEEE International Conference on Computer and Information Technology (CIT 2007).

[34]  Lorenzo Martignoni,et al.  FluXOR: Detecting and Monitoring Fast-Flux Service Networks , 2008, DIMVA.

[35]  Kensuke Fukuda,et al.  Extracting hidden anomalies using sketch and non Gaussian multiresolution statistical detection procedures , 2007, LSAD '07.

[36]  Lawrence K. Saul,et al.  Beyond blacklists: learning to detect malicious web sites from suspicious URLs , 2009, KDD.

[37]  W. Timothy Strayer,et al.  Detecting Botnets with Tight Command and Control , 2006, Proceedings. 2006 31st IEEE Conference on Local Computer Networks.

[38]  Sandeep Yadav,et al.  Winning with DNS Failures: Strategies for Faster Botnet Detection , 2011, SecureComm.

[39]  Suresh Singh,et al.  An Algorithm for Anomaly-based Botnet Detection , 2006, SRUTI.

[40]  Wenke Lee,et al.  Detecting Malware Domains at the Upper DNS Hierarchy , 2011, USENIX Security Symposium.