Static Specification Mining Using Automata-Based Abstractions

We present a novel approach to client-side mining of temporal API specifications based on static analysis. Specifically, we present an interprocedural analysis over a combined domain that abstracts both aliasing and event sequences for individual objects. The analysis uses a new family of automata-based abstractions to represent unbounded event sequences; these abstractions are designed to keep distinct usage patterns apart while merging similar ones. Our approach also includes an algorithm that summarizes abstract traces based on automata clusters and effectively rules out spurious behaviors. We report experimental results from mining specifications from a number of Java clients and APIs. The results indicate that effective static analysis for client-side mining requires fairly precise treatment of both aliasing and abstract event sequences. Based on these results, we conclude that static client-side specification mining shows promise as a complement or alternative to dynamic approaches.
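
To make the idea of an automata-based abstraction of event sequences concrete, the following minimal Python sketch bounds unbounded event histories by merging all prefixes that share their last k events. This is a simplified illustration and not the abstraction family defined in the paper; the function names (`build_past_k_automaton`, `accepts`), the parameter `k`, and the file-style events are assumptions introduced only for this example.

```python
from collections import defaultdict


def build_past_k_automaton(traces, k=1):
    """Build a finite automaton from event traces (hypothetical helper).

    A state is the tuple of the last k events seen so far, so any two
    prefixes ending in the same k events are merged into one state.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    transitions = defaultdict(set)  # state -> {(event, next_state)}
    accepting = set()
    for trace in traces:
        state = ()  # initial state: empty history
        for event in trace:
            next_state = (state + (event,))[-k:]
            transitions[state].add((event, next_state))
            state = next_state
        accepting.add(state)  # each observed trace ends in an accepting state
    return dict(transitions), accepting


def accepts(automaton, trace):
    """Check whether the automaton accepts the given event trace."""
    transitions, accepting = automaton
    state = ()
    for event in trace:
        targets = [nxt for (ev, nxt) in transitions.get(state, ()) if ev == event]
        if not targets:
            return False
        # The next state is determined by (state, event), so there is one target.
        state = targets[0]
    return state in accepting


# Two observed usage traces for a single (hypothetical) file-like object.
observed = [
    ("open", "read", "close"),
    ("open", "read", "read", "close"),
]
automaton = build_past_k_automaton(observed, k=1)
```

With `k=1`, the two observed traces collapse into an automaton with a self-loop on `read`, so it also accepts traces with any number of reads between `open` and `close`, while still rejecting `open` followed directly by `close`. A larger `k` merges fewer prefixes and so keeps more usage patterns apart, at the cost of more states.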
