Network Intrusion Dataset Assessment

Abstract: Research into classification using Anomaly Detection (AD) within the field of Network Intrusion Detection (NID), or Network Intrusion Anomaly Detection (NIAD), is common, but operational use of the classifiers that this research produces is not. One reason is that most published testing of AD methods uses artificial datasets, which makes it difficult to determine how well published results apply to other datasets and the networks they represent. This research develops a method to predict the accuracy of an AD-based classifier when applied to a new dataset, based on the difference between an already classified dataset and the new dataset. The resulting method does not accurately predict classifier accuracy, but it does yield some information about the possible range of accuracy. Further refinement of this method could allow rapid operational application of new techniques within the NIAD field, and quick selection of the classifier(s) that will be most accurate for a given network.
