A multinomial logistic regression modeling approach for anomaly intrusion detection

Although researchers have long studied statistical modeling techniques for detecting anomalous intrusions and profiling user behavior, the feasibility of using multinomial logistic regression to predict multiple attack types has not been addressed, and the risk factors associated with the individual major attack types remain unclear. To address these gaps, this study used the KDD-cup 1999 data and a bootstrap simulation method to fit 3000 multinomial logistic regression models with the most frequent attack types (probe, DoS, U2R, and R2L) as an unordered dependent variable, and identified 13 risk factors that are statistically significantly associated with these attacks. These risk factors were then used to construct a final multinomial model that had an ROC area of 0.99 for detecting abnormal events. Compared with the top KDD-cup 1999 winning results, which were based on a rule-based decision tree algorithm, the multinomial logistic model-based classification results had similar sensitivity in detecting normal (98.3% vs. 99.5%), probe (85.6% vs. 83.3%), and DoS (97.2% vs. 97.1%) events; higher sensitivity for U2R (25.9% vs. 13.2%) and R2L (11.2% vs. 8.4%); and a significantly lower overall misclassification rate (18.9% vs. 35.7%). The results indicate that multinomial logistic regression modeling with the 13 risk factors provides a robust approach to anomaly intrusion detection.
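The bootstrap-plus-multinomial-logit idea described above can be illustrated with a minimal pure-Python sketch. It is not the study's procedure: the abstract does not say how significance was assessed, so the percentile-interval rule used here is an assumption, as are the toy data, the function names (`fit_multinomial`, `bootstrap_intervals`), the plain gradient-ascent fitting, and the small replicate count (20 rather than 3000, chosen only to keep the example fast). Class 0 stands in for "normal" as the reference category.

```python
import math
import random

def softmax(scores):
    """Turn linear scores into class probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(coefs, x):
    """coefs[k] = [intercept, b_1, ..., b_p] for class k+1.
    Class 0 is the reference category with all coefficients fixed at 0."""
    scores = [0.0] + [c[0] + sum(b * v for b, v in zip(c[1:], x)) for c in coefs]
    return softmax(scores)

def fit_multinomial(X, y, n_classes, lr=0.3, epochs=50):
    """Fit a baseline-category logit model by batch gradient ascent
    on the log-likelihood (a simple stand-in for a real optimizer)."""
    n, p = len(X), len(X[0])
    coefs = [[0.0] * (p + 1) for _ in range(n_classes - 1)]
    for _ in range(epochs):
        grads = [[0.0] * (p + 1) for _ in range(n_classes - 1)]
        for x, label in zip(X, y):
            probs = predict_proba(coefs, x)
            for k in range(1, n_classes):
                err = (1.0 if label == k else 0.0) - probs[k]
                grads[k - 1][0] += err
                for j, v in enumerate(x):
                    grads[k - 1][j + 1] += err * v
        for k in range(n_classes - 1):
            for j in range(p + 1):
                coefs[k][j] += lr * grads[k][j] / n
    return coefs

def bootstrap_intervals(X, y, n_classes, n_boot=20, seed=0):
    """Refit on resampled rows and return a crude 95% percentile interval
    for each (class, feature) slope. Assumed selection rule, see lead-in."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        draws.append(fit_multinomial([X[i] for i in idx],
                                     [y[i] for i in idx], n_classes))
    intervals = {}
    for k in range(n_classes - 1):
        for j in range(1, p + 1):
            vals = sorted(d[k][j] for d in draws)
            lo = vals[int(0.025 * (n_boot - 1))]
            hi = vals[int(0.975 * (n_boot - 1))]
            intervals[(k + 1, j - 1)] = (lo, hi)
    return intervals

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rng = random.Random(42)
X, y = [], []
for i in range(120):
    label = i % 3
    X.append([rng.gauss(2.0 * (label - 1), 1.0), rng.gauss(0.0, 1.0)])
    y.append(label)

intervals = bootstrap_intervals(X, y, n_classes=3)
# Flag a feature when any class-specific interval excludes zero.
flagged = sorted({j for (k, j), (lo, hi) in intervals.items() if lo > 0 or hi < 0})
print(intervals)
print("flagged features:", flagged)
```

In practice one would use an established statistics package for the fitting step and far more bootstrap replicates; the sketch only shows how resampling, refitting, and a per-coefficient decision rule fit together.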
