Applied web traffic analysis for numerical encoding of SQL Injection attack features.

SQL Injection Attack (SQLIA) remains a technique used by a computer network intruder to pilfer an organisation’s confidential data. This is done by an intruder re-crafting web form’s input and query strings used in web requests with malicious intent to compromise the security of an organisation’s confidential data stored at the back-end database. The database is the most valuable data source, and thus, intruders are unrelenting in constantly evolving new techniques to bypass the signature’s solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is therefore a need for an automated scalable methodology in the pre-processing of SQLIA features fit for a supervised learning model. However, obtaining a ready-made scalable dataset that is feature engineered with numerical attributes dataset items to train Artificial Neural Network (ANN) and Machine Leaning (ML) models is a known issue in applying artificial intelligence to effectively address ever evolving novel SQLIA signatures. This proposed approach applies numerical attributes encoding ontology to encode features (both legitimate web requests and SQLIA) to numerical data items as to extract scalable dataset for input to a supervised learning model in moving towards a ML SQLIA detection and prevention model. In numerical attributes encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA). This combined with proxy and SQL parser Application Programming Interface (API) to intercept and parse web requests in transition to the back-end database. In developing a solution to address SQLIA, this model allows processed web requests at the proxy deemed to contain injected query string to be excluded from reaching the target back-end database. This paper is intended for evaluating the performance metrics of a dataset obtained by numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) studio using Two-Class Support Vector Machines (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.

[1]  Lionel C. Briand,et al.  Behind an Application Firewall, Are We Safe from SQL Injection Attacks? , 2015, 2015 IEEE 8th International Conference on Software Testing, Verification and Validation (ICST).

[2]  Sotiris B. Kotsiantis,et al.  Supervised Machine Learning: A Review of Classification Techniques , 2007, Informatica.

[3]  Dan Wang,et al.  Detecting SQL Vulnerability Attack Based on the Dynamic and Static Analysis Technology , 2015, 2015 IEEE 39th Annual Computer Software and Applications Conference.

[4]  Angelos D. Keromytis,et al.  SQLrand: Preventing SQL Injection Attacks , 2004, ACNS.

[5]  Alessandro Orso,et al.  A Classification of SQL Injection Attacks and Countermeasures , 2006, ISSSE.

[6]  V. N. Venkatakrishnan,et al.  CANDID: Dynamic candidate evaluations for automatic prevention of SQL injection attacks , 2010, TSEC.

[7]  Bruce W. Weide,et al.  Using parse tree validation to prevent SQL injection attacks , 2005, SEM '05.

[8]  William J. Buchanan,et al.  Numerical encoding to Tame SQL injection attacks , 2016, NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium.

[9]  Debabrata Kar,et al.  SQLiDDS: SQL Injection Detection Using Query Transformation and Document Similarity , 2015, ICDCIT.

[10]  Premkumar T. Devanbu,et al.  JDBC checker: a static analysis tool for SQL/JDBC applications , 2004, Proceedings. 26th International Conference on Software Engineering.

[11]  Angelos Stavrou,et al.  SQLProb: a proxy-based architecture towards preventing SQL injection attacks , 2009, SAC '09.

[12]  Alessandro Orso,et al.  AMNESIA: analysis and monitoring for NEutralizing SQL-injection attacks , 2005, ASE.