An ensemble learning approach for XSS attack detection with domain knowledge and threat intelligence

Abstract Cross-site scripting (XSS) is one of the most dangerous attacks on web security. Traditional XSS detection methods focus mainly on the vulnerability itself and rely on static and dynamic analysis, which are weak at defending against the flood of diverse payloads. This paper proposes an XSS attack detection method based on an ensemble learning approach that uses a set of Bayesian networks, each built with both domain knowledge and threat intelligence. In addition, an analysis method is proposed to explain the results: it sorts the nodes in the Bayesian network according to their influence on the output node, which makes the results explainable to end users. To validate the proposed method, experiments are performed on a real-world XSS attack dataset. The results show the superiority of the proposed method, especially as the number of attacks increases. Moreover, the node sorting results could help the security team respond to an attack in time.
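
The two ideas in the abstract, combining several Bayesian networks into one detector and ranking nodes by their influence on the output node, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names, the probabilities, the naive-Bayes-shaped networks (one class node with feature children), the averaging rule for the ensemble, and the "flip one observation" influence measure.

```python
from math import prod


class TinyBayesNet:
    """Illustrative network: one 'attack' node with binary feature children."""

    def __init__(self, prior, cpt):
        self.prior = prior  # P(attack = 1), assumed value
        self.cpt = cpt      # feature -> (P(f=1 | benign), P(f=1 | attack))

    def posterior(self, evidence):
        """Return P(attack = 1 | evidence) by exact enumeration."""
        def likelihood(cls):
            # Features this network does not model are ignored.
            return prod(
                self.cpt[f][cls] if value else 1.0 - self.cpt[f][cls]
                for f, value in evidence.items()
                if f in self.cpt
            )

        joint_attack = self.prior * likelihood(1)
        joint_benign = (1.0 - self.prior) * likelihood(0)
        return joint_attack / (joint_attack + joint_benign)


def ensemble_posterior(networks, evidence):
    """Assumed combination rule: average the members' posteriors."""
    return sum(net.posterior(evidence) for net in networks) / len(networks)


def rank_nodes(networks, evidence):
    """Sort observed nodes by how much flipping each one shifts the output."""
    base = ensemble_posterior(networks, evidence)
    influence = {}
    for feature, value in evidence.items():
        flipped = dict(evidence)
        flipped[feature] = not value
        influence[feature] = abs(base - ensemble_posterior(networks, flipped))
    return sorted(influence, key=influence.get, reverse=True)


# Hypothetical members: one leans on payload syntax (domain knowledge),
# the other also uses a blacklist signal (threat intelligence).
net_syntax = TinyBayesNet(0.3, {
    "has_script_tag": (0.05, 0.8),
    "has_event_handler": (0.10, 0.6),
})
net_intel = TinyBayesNet(0.3, {
    "has_script_tag": (0.05, 0.7),
    "ip_in_blacklist": (0.02, 0.5),
})
networks = [net_syntax, net_intel]

evidence = {
    "has_script_tag": True,
    "has_event_handler": False,
    "ip_in_blacklist": True,
}

score = ensemble_posterior(networks, evidence)
ranking = rank_nodes(networks, evidence)
print(f"P(attack | evidence) = {score:.3f}")
print("Nodes by influence:", ranking)
```

In this toy setting the ranking tells an analyst which observation contributed most to the alert. A real implementation would use the network structures and parameters learned as described in the paper, and its own influence measure.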
