DeepSQLi: deep semantic learning for testing SQL injection

Security is unarguably the most serious concern for Web applications, and SQL injection (SQLi) is one of the most devastating attacks against them. Automatically testing for SQLi vulnerabilities is of utmost importance, yet it is far from trivial to implement, because a huge, potentially infinite, number of variants and semantic possibilities of SQL can lead to SQLi attacks on various Web applications. In this paper, we propose a tool based on deep natural language processing, dubbed DeepSQLi, to generate test cases for detecting SQLi vulnerabilities. By adopting a deep learning based neural language model and sequence-of-words prediction, DeepSQLi is able to learn the semantic knowledge embedded in SQLi attacks, allowing it to translate a user input (or a test case) into a new test case that is semantically related and potentially more sophisticated. Experiments are conducted to compare DeepSQLi with SQLmap, a state-of-the-art SQLi testing automation tool, on six real-world Web applications of different scales, characteristics and domains. Empirical results demonstrate the effectiveness and the remarkable superiority of DeepSQLi over SQLmap: it identifies more SQLi vulnerabilities using fewer test cases, whilst running much faster.
