Source code profiling and classification for automated detection of logical errors

Research and industrial experience indicate that code review, as part of software inspection, may be the most cost-effective technique a team can use to reduce defects. Tools that automate code inspection mostly focus on detecting a priori known defect patterns and security vulnerabilities. Automated detection of logical errors, that is, errors due to a faulty implementation of an application's functionality, remains relatively uncharted territory. Such automation can be based on profiling the intended behavior behind the source code. In this paper, we present a code profiling method based on token classification. Our method combines information flow analysis, the crosschecking of dynamic invariants with symbolic execution, and code classification heuristics that use a fuzzy logic system. Our goal is to detect logical errors and exploitable vulnerabilities. We discuss both the theoretical underpinnings and the practical implementation of our approach. We test the APP_LogGIC tool, which implements the proposed analysis, on two real-world applications. The results show that profiling the intended program behavior is feasible in diverse applications. We discuss in detail the heuristics used to overcome the problems of state space explosion and large data sets. Code metrics and test results are provided to demonstrate the effectiveness of the proposed approach. This paper extends work that appears in an article currently submitted to an international conference with proceedings. In this extended version of our method, we present classification mechanisms that can take multiple user inputs into account, and we provide a detailed description of the source code classification techniques and heuristics used.
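To illustrate how a fuzzy logic system can turn heuristic code classifications into a single ranking, the following is a minimal sketch and not the APP_LogGIC implementation. It assumes two hypothetical inputs on a 1 to 5 scale (here called `impact` and `likelihood`), three triangular fuzzy sets per input, a hypothetical rule table, and a simplified weighted-average defuzzification with singleton outputs. All names, scales, and rules are assumptions made for illustration.

```python
# Illustrative sketch only: every name, scale and rule below is an assumption.

def tri(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Three fuzzy sets over an assumed 1..5 rating scale.
SETS = {
    "low": (0.0, 1.0, 3.0),
    "medium": (1.0, 3.0, 5.0),
    "high": (3.0, 5.0, 7.0),
}

# Hypothetical rule table: (impact set, likelihood set) -> risk set.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}

# Singleton output value for each risk set (simplified defuzzification).
RISK_VALUE = {"low": 1.0, "medium": 3.0, "high": 5.0}

def fuzzify(x):
    """Return the membership degree of x in each fuzzy set."""
    return {name: tri(x, *abc) for name, abc in SETS.items()}

def risk(impact, likelihood):
    """Combine two ratings in [1, 5] into one crisp risk score in [1, 5]."""
    mu_i = fuzzify(impact)
    mu_l = fuzzify(likelihood)
    weighted_sum = 0.0
    total_strength = 0.0
    for (i_set, l_set), r_set in RULES.items():
        # AND of the two antecedents is modeled with min.
        strength = min(mu_i[i_set], mu_l[l_set])
        weighted_sum += strength * RISK_VALUE[r_set]
        total_strength += strength
    # Weighted average of the singleton outputs of all fired rules.
    return weighted_sum / total_strength
```

A call such as `risk(2, 5)` blends the rules for low and medium impact under high likelihood, so code locations can be ranked on a continuous scale rather than sorted into hard categories. A full fuzzy inference system would typically use output membership functions and a centroid-style defuzzifier instead of the singleton shortcut used here.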
