Mining Common Outliers for Intrusion Detection

Data mining for intrusion detection can be divided into several sub-topics, one of which is unsupervised clustering, whose properties are controversial. Unsupervised clustering for intrusion detection aims to i) group behaviours together according to their similarity and ii) detect groups containing only one (or very few) behaviour(s). Such isolated behaviours deviate from the model of normality and are therefore considered malicious. However, not all atypical behaviours are attacks or intrusion attempts, which is one drawback of intrusion detection methods based on clustering. We propose requiring an additional feature of isolated behaviours before they are considered malicious. This feature is based on the possible repeated occurrence of the behaviour on many information systems. Based on this feature, we propose a new outlier mining method, which we validate through a set of experiments.
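
The idea of keeping only those outliers that recur across several information systems can be illustrated with a minimal sketch. It is not the method proposed in the paper, whose details the abstract does not give. It assumes that behaviours are numeric feature vectors, that a simple distance-threshold grouping stands in for the clustering step, and that two outliers from different systems count as the same behaviour when they lie within the same threshold. The names `cluster`, `outliers`, `common_outliers`, `radius`, `max_size` and `min_sites`, and the sample data, are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def cluster(points, radius):
    """Group points so that any point within `radius` of a cluster joins it."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            close = any(dist(p, q) <= radius for q in c)
            (near if close else far).append(c)
        # Merge p with every cluster it touches; keep the others unchanged.
        clusters = far + [[p] + [q for c in near for q in c]]
    return clusters


def outliers(points, radius, max_size=1):
    """Return the behaviours that end up in clusters of at most `max_size` members."""
    return [p for c in cluster(points, radius) if len(c) <= max_size for p in c]


def common_outliers(sites, radius, min_sites=2, max_size=1):
    """Keep only outliers that also appear as outliers on at least `min_sites` sites.

    `sites` maps a site name to its list of behaviours. A site's own outlier
    counts towards its support, so min_sites=2 means "this site plus one other".
    """
    per_site = {name: outliers(pts, radius, max_size) for name, pts in sites.items()}
    common = []
    for name, outs in per_site.items():
        for p in outs:
            support = sum(
                any(dist(p, q) <= radius for q in other_outs)
                for other_outs in per_site.values()
            )
            if support >= min_sites:
                common.append((name, p))
    return common


# Hypothetical data: each site has a dense "normal" group near the origin,
# one outlier shared with the other site (near (5, 5)) and one local oddity.
sites = {
    "site_a": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (9.0, 1.0)],
    "site_b": [(0.0, 0.05), (0.1, 0.1), (0.05, 0.0), (5.1, 5.0), (1.0, 9.0)],
}
result = common_outliers(sites, radius=0.5)
print(result)
```

In this sketch the site-specific oddities at (9, 1) and (1, 9) are isolated on their own systems but have no counterpart elsewhere, so they are not reported; only the behaviour near (5, 5), which is an outlier on both systems, is kept.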
