K-NN Based Outlier Detection Technique on Intrusion Dataset

Outliers are objects in a dataset that deviate from the rest of the data by some measure. The Nearest Neighbor Outlier Factor is used to measure the degree of outlier-ness of an object in the dataset. Unlike other methods such as the Local Outlier Factor, this approach considers a point from the perspective of both its neighbors and its reverse neighbors before the object is evaluated. We observed that the GBBK algorithm, which is based on K-NN, uses quicksort to find the k nearest neighbors, which takes O(N log N) time. In the proposed method, the search is instead repeated k times, which finds the k nearest neighbors in O(kN) time, where k ≪ log N. As a result, the proposed method improves the time complexity. The NSL-KDD and Fisher iris datasets are used, and the experimental results are compared with the GBBK method. Both methods produce the same results, but the proposed method takes less computation time.
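
The complexity argument can be illustrated with a minimal sketch. It is not the authors' implementation; the function names (`knn_by_sort`, `knn_by_k_scans`, `distances_from`) and the toy points are assumptions made for illustration. The first function sorts all N distances, as a sort-based search would, and the second makes k linear passes, each selecting the closest point not yet chosen.

```python
import math


def distances_from(points, q):
    """Euclidean distance from points[q] to every point.

    The query's own entry is set to infinity so it is never
    returned as its own neighbor.
    """
    return [math.inf if i == q else math.dist(points[q], p)
            for i, p in enumerate(points)]


def knn_by_sort(dists, k):
    """Sort-based baseline: order all N distances, O(N log N), keep the first k."""
    order = sorted(range(len(dists)), key=lambda i: dists[i])
    return order[:k]


def knn_by_k_scans(dists, k):
    """k linear scans, O(kN): each pass picks the closest index not yet taken."""
    taken = [False] * len(dists)
    chosen = []
    for _ in range(min(k, len(dists))):
        best = -1
        for i, d in enumerate(dists):
            if not taken[i] and (best == -1 or d < dists[best]):
                best = i
        taken[best] = True
        chosen.append(best)
    return chosen


# Hypothetical toy data, used only to show that both searches agree.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (0.5, 0.5)]
d = distances_from(points, 0)
print(knn_by_sort(d, 2))     # indices of the 2 nearest neighbors of point 0
print(knn_by_k_scans(d, 2))  # same indices, found with 2 linear passes
```

Both functions return the same neighbors; the repeated scan only pays off when k is small relative to log N, which is the condition stated above.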