A Toolset for Intrusion and Insider Threat Detection

Company data are a valuable asset and must be protected against unauthorized access and manipulation. In this contribution, we report on our ongoing work, which aims to support IT security experts in identifying novel or obfuscated attacks in company networks, regardless of whether they originate inside or outside the company network. We propose a new toolset for anomaly-based network intrusion detection. The toolset uses flow-based data, which can be retrieved easily from central network components. We study the challenges of analysing flow-based data streams with data mining algorithms and build an appropriate approach step by step. In contrast to previous work, we collect flow-based data for each host over a certain time window, include the knowledge of domain experts, and analyse the data from three different views. We argue that incorporating expert knowledge and previous flows allows us to create more meaningful attributes for subsequent analysis methods. In this way, we aim to detect novel attacks while limiting the number of false positives.
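To make the idea of collecting flow-based data per host over a time window more concrete, the following minimal sketch groups flows by source host and fixed-length window and derives a few aggregate attributes. It is an illustration under stated assumptions, not the toolset's implementation: the `Flow` fields, the 60-second window, and the four derived attributes are hypothetical choices that do not come from this work.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (field names are assumptions)."""
    start: float    # flow start time in seconds
    src_ip: str
    dst_ip: str
    dst_port: int
    n_bytes: int


def aggregate_per_host(flows, window_seconds=60.0):
    """Group flows by (source host, window index) and derive simple attributes.

    The window length and the attribute set are illustrative assumptions.
    """
    buckets = defaultdict(list)
    for flow in flows:
        window = int(flow.start // window_seconds)  # fixed, non-overlapping windows
        buckets[(flow.src_ip, window)].append(flow)

    features = {}
    for key, group in buckets.items():
        features[key] = {
            "n_flows": len(group),
            "n_dst_hosts": len({f.dst_ip for f in group}),
            "n_dst_ports": len({f.dst_port for f in group}),
            "total_bytes": sum(f.n_bytes for f in group),
        }
    return features


if __name__ == "__main__":
    example = [
        Flow(0.0, "10.0.0.1", "10.0.0.2", 22, 100),
        Flow(10.0, "10.0.0.1", "10.0.0.3", 80, 50),
        Flow(70.0, "10.0.0.1", "10.0.0.2", 22, 10),
    ]
    for key, attrs in sorted(aggregate_per_host(example).items()):
        print(key, attrs)
```

Attributes of this kind, for example the number of distinct destination ports contacted by one host within a window, describe host behaviour over time rather than a single flow, which is the sort of context that a per-flow analysis cannot capture.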
