Paper Information - The Use of NLP Techniques in Static Code Analysis to Detect Weaknesses and Vulnerabilities

We apply classical NLP techniques (n-grams and various smoothing algorithms), combined with machine learning, to a non-NLP task: detecting, classifying, and reporting weaknesses related to vulnerabilities or bad coding practices in artificial constrained languages, such as programming languages and their compiled counterparts. In our results summary, we compare and contrast the NLP approach with the signal processing approach and present concrete, promising results for specific test cases of open-source software written in C, C++, and Java. For this task we use the open-source MARF NLP framework and its MARFCAT application, the latter originally designed for the Static Analysis Tool Exposition (SATE) workshop.
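
To make the n-gram idea concrete, the following is a minimal sketch, not MARFCAT's implementation or the MARF API. It assumes source code has already been split into tokens, trains one bigram model per class on toy examples, and labels an unseen token sequence by whichever model assigns it the higher smoothed log-probability. Add-delta smoothing stands in for the "various smoothing algorithms" the abstract mentions; the class names, labels, and training snippets are all hypothetical.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs, padded with start/end markers."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return list(zip(padded, padded[1:]))


class BigramModel:
    """Bigram language model with add-delta smoothing (illustrative only)."""

    def __init__(self, delta=1.0):
        self.delta = delta
        self.pair_counts = Counter()     # counts of (previous, current) pairs
        self.context_counts = Counter()  # counts of each previous token
        self.vocab = {"<s>", "</s>"}

    def train(self, token_sequences):
        for tokens in token_sequences:
            self.vocab.update(tokens)
            for first, second in bigrams(tokens):
                self.pair_counts[(first, second)] += 1
                self.context_counts[first] += 1

    def log_prob(self, tokens, vocab_size):
        """Smoothed log-probability of a token sequence under this model."""
        total = 0.0
        for first, second in bigrams(tokens):
            numerator = self.pair_counts[(first, second)] + self.delta
            denominator = self.context_counts[first] + self.delta * vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(tokens, models):
    """Return the label whose model scores the token sequence highest."""
    # Use one shared vocabulary size so the models are compared fairly.
    vocab_size = len(set().union(*(m.vocab for m in models.values())))
    return max(models, key=lambda label: models[label].log_prob(tokens, vocab_size))


# Hypothetical toy training data: token sequences labeled by hand.
weak = BigramModel()
weak.train([
    ["strcpy", "(", "buf", ",", "input", ")"],
    ["gets", "(", "buf", ")"],
])
safe = BigramModel()
safe.train([
    ["strncpy", "(", "buf", ",", "input", ",", "n", ")"],
    ["fgets", "(", "buf", ",", "n", ",", "stdin", ")"],
])
models = {"weak": weak, "safe": safe}

print(classify(["gets", "(", "line", ")"], models))
```

Smoothing matters here because the unseen identifier `line` would otherwise give every model a zero probability; with add-delta smoothing, the sequence is still scored and the familiar `gets (` pair decides the label.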

Mourad Debbabi | Serguei A. Mokhov | Joey Paquet
