Javert: fully automatic mining of general temporal properties from dynamic traces

Program specifications are important for many tasks during software design, development, and maintenance. Among these, temporal specifications are particularly useful. They express formal correctness requirements of an application's ordering of specific actions and events during execution, such as the strict alternation of acquisition and release of locks. Despite their importance, temporal specifications are often missing, incomplete, or described only informally. Many techniques have been proposed that mine such specifications from execution traces or program source code. However, existing techniques mine only simple patterns, or they mine a single complex pattern that is restricted to a particular set of manually selected events. There is no practical, automatic technique that can mine general temporal properties from execution traces. In this paper, we present Javert, the first general specification mining framework that can learn, fully automatically, complex temporal properties from execution traces. The key insight behind Javert is that real, complex specifications can be formed by composing instances of small generic patterns, such as the alternating pattern ((ab)*) and the resource usage pattern ((ab*c)*). In particular, Javert learns simple generic patterns and composes them using sound rules to construct large, complex specifications. We have implemented the algorithm in a practical tool and conducted an extensive empirical evaluation on several open source software projects. Our results are promising; they show that Javert is scalable, general, and precise. It discovered many interesting, nontrivial specifications in real-world code that are beyond the reach of existing automatic techniques.
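To make the two generic patterns concrete, the following is a minimal sketch of how instances of the alternating pattern and the resource usage pattern could be found in execution traces. It is not Javert's algorithm: it brute-forces every ordered tuple of events, projects each trace onto that tuple, and checks the projection against a regular expression. The function names (`project`, `satisfies`, `mine`) and the example traces are hypothetical, and the composition step described in the abstract is not shown.

```python
import re
from itertools import permutations

# The two generic patterns named in the abstract, over letters a, b, c.
ALTERNATING = re.compile(r"(ab)*")    # a and b strictly alternate
RESOURCE = re.compile(r"(ab*c)*")     # acquire, use zero or more times, release


def project(trace, mapping):
    """Drop events outside the mapping and rename the rest to pattern letters."""
    return "".join(mapping[event] for event in trace if event in mapping)


def satisfies(pattern, traces, events):
    """True if every trace, projected onto `events`, fully matches `pattern`."""
    mapping = dict(zip(events, "abc"))
    return all(pattern.fullmatch(project(trace, mapping)) for trace in traces)


def mine(pattern, arity, traces):
    """Brute-force search (illustrative only) over ordered event tuples."""
    alphabet = sorted({event for trace in traces for event in trace})
    return [
        combo
        for combo in permutations(alphabet, arity)
        if satisfies(pattern, traces, combo)
    ]


# Hypothetical traces, invented for illustration.
traces = [
    ["lock", "open", "read", "read", "close", "unlock"],
    ["open", "read", "close", "lock", "unlock", "open", "close"],
]

alternating_instances = mine(ALTERNATING, 2, traces)
resource_instances = mine(RESOURCE, 3, traces)
print(alternating_instances)
print(resource_instances)
```

On these two traces, the alternating search keeps only the pairs (lock, unlock) and (open, close), and the resource usage search includes (open, read, close). Exhaustive enumeration grows quickly with the number of distinct events, so this sketch is meant only to show what an instance of a pattern is, not how a scalable miner would find one.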
