Python predictive analysis for bug detection

Python is a popular dynamic language that enables rapid software development. However, program analysis engines for Python are largely lacking. In this paper, we present a predictive analysis for Python. It first collects the trace of an execution, and then encodes the trace and the unexecuted branches into symbolic constraints. Symbolic variables are introduced to denote input values, their dynamic types, and their attribute sets, so that the analysis can reason about variations of the observed execution. Solving the constraints identifies bugs and the inputs that trigger them. Our evaluation shows that the technique is highly effective in analyzing complex real-world programs that use many dynamic features and external library calls, owing to its trace-based encoding design. It identifies 46 bugs in 11 real-world projects, including 16 new bugs. All reported bugs are true positives.
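
The workflow described above (record a trace, encode the unexecuted branch as constraints over the input's value and dynamic type, then solve for a bug-triggering input) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the traced function `f`, the names `candidates`, `solve`, and `predict_bugs`, and the brute-force search over a small finite domain, which stands in for the constraint solver. It is not the paper's actual encoding.

```python
# Hypothetical program whose single execution was traced:
#
#     def f(x):
#         if x < 5:
#             return x + 1
#         return 100 // (x - 7)
#
# Assume the observed run used x = 3, so only the `x < 5` branch executed.

# Symbolic variables: the dynamic type and the value of the input x.
# A small finite domain replaces a real constraint solver (assumption).
INT_VALUES = range(-10, 11)
STR_VALUES = ("", "a")


def candidates():
    """Enumerate every (type, value) assignment in the toy domain."""
    for v in INT_VALUES:
        yield {"type": "int", "value": v}
    for v in STR_VALUES:
        yield {"type": "str", "value": v}


def takes_unexecuted_branch(env):
    # Negation of the observed branch condition `x < 5`.
    # The comparison is only well-typed when x is an int.
    return env["type"] == "int" and not (env["value"] < 5)


def division_by_zero(env):
    # Bug condition inside the unexecuted branch: the divisor x - 7 is zero.
    return takes_unexecuted_branch(env) and env["value"] - 7 == 0


def type_error_at_comparison(env):
    # Varying the dynamic type: `x < 5` raises TypeError for a str input.
    return env["type"] == "str"


BUG_QUERIES = {
    "ZeroDivisionError": division_by_zero,
    "TypeError": type_error_at_comparison,
}


def solve(constraint):
    """Return the first assignment satisfying the constraint, or None."""
    for env in candidates():
        if constraint(env):
            return env
    return None


def predict_bugs():
    """Map each bug class to a triggering input, or None if none exists."""
    return {name: solve(query) for name, query in BUG_QUERIES.items()}


if __name__ == "__main__":
    for name, witness in predict_bugs().items():
        print(name, witness)
```

In this sketch, neither bug occurs in the observed run with `x = 3`; both are predicted from it by varying the input's value or its dynamic type. A realistic implementation would presumably hand such constraints to an SMT solver rather than enumerate candidates.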
