Inferring Concise Specifications of APIs

Modern software relies on libraries, which it uses through application programming interfaces (APIs). Formal specifications of APIs enable correct API usage as well as many software engineering tasks. In this work, we analyze the implementation of each method in an API to infer a formal postcondition. Conventional wisdom is that, given preconditions, one can use the strongest postcondition predicate transformer (SP) to infer postconditions. However, SP yields postconditions that are exponentially large, which makes them difficult for both humans and tools to use. Our key idea is an algorithm that converts such exponentially large specifications into a more concise, and thus more usable, form by leveraging the structure of the specifications that SP produces. We applied our technique to infer postconditions for over 2,300 methods in seven popular Java libraries. It inferred specifications for 75.7% of these methods, each of which was verified using an Extended Static Checker. We also found that 84.6% of the resulting specifications were shorter than 1/4 page (20 lines). Our technique reduced the length of the SMT proofs needed to verify implementations by 76.7% and reduced prover execution time by 26.7%.
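To make the size problem concrete, the following is a minimal sketch of why a naive strongest-postcondition computation grows exponentially. It does not reproduce the algorithm of this work. The tuple-based mini-language, the names `sp`, `abs_chain`, and `render`, and the restriction to programs that assign each variable at most once (so that sp(x := e, P) simplifies to P && x == e) are all assumptions made for illustration.

```python
# Hypothetical mini-language (an assumption for illustration, not from the paper):
#   ("assign", var, expr)                      -- var is assigned at most once
#   ("if", cond, then_stmts, else_stmts)
# A postcondition is kept in disjunctive normal form: a list of disjuncts,
# each disjunct being a tuple of conjunct strings.

def sp(stmts, disjuncts):
    """Naive strongest postcondition of a statement list w.r.t. a DNF precondition."""
    for stmt in stmts:
        kind = stmt[0]
        if kind == "assign":
            _, var, expr = stmt
            # Single-assignment assumption: sp(x := e, P) = P && x == e.
            disjuncts = [d + (f"{var} == {expr}",) for d in disjuncts]
        elif kind == "if":
            _, cond, then_stmts, else_stmts = stmt
            # sp(if c then S1 else S2, P) = sp(S1, P && c) || sp(S2, P && !c)
            taken = sp(then_stmts, [d + (cond,) for d in disjuncts])
            skipped = sp(else_stmts, [d + (f"!({cond})",) for d in disjuncts])
            disjuncts = taken + skipped
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return disjuncts


def abs_chain(n):
    """Build n sequential conditionals, each computing b_i = |a_i|."""
    return [
        ("if", f"a{i} >= 0",
         [("assign", f"b{i}", f"a{i}")],
         [("assign", f"b{i}", f"-a{i}")])
        for i in range(n)
    ]


def render(disjuncts):
    """Pretty-print a DNF postcondition as a single formula string."""
    return " || ".join("(" + " && ".join(d) + ")" for d in disjuncts)


if __name__ == "__main__":
    print(render(sp(abs_chain(1), [("true",)])))
    for n in (1, 2, 3, 10):
        print(n, "conditionals ->", len(sp(abs_chain(n), [("true",)])), "disjuncts")
```

Each conditional doubles the number of disjuncts, so a method with n sequential conditionals yields 2^n of them in this naive form, even though the behavior here could be stated with one conjunct per conditional. That gap between the naive form and a concise equivalent is the kind of redundancy the abstract refers to.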
