A Sparse Coding Method for Specification Mining and Error Localization

Formal specifications play a central role in the design, verification, and debugging of systems. We consider the problem of mining specifications from simulation or execution traces of reactive systems, with a special focus on digital circuits. We propose a novel sparse coding method that extracts specifications in the form of a set of basis subtraces. For a set of finite subtraces, each of length p, we introduce the sparse Boolean basis problem: decompose each subtrace into a Boolean combination of only a small number of basis subtraces of the same length. This paper makes three contributions: (1) we formally define the sparse Boolean basis problem and propose a graph-based algorithm to solve it; (2) we demonstrate that our sparse coding method can mine useful specifications; and (3) we show that the computed bases can be used for simultaneous error localization and error explanation, in a setting that is especially applicable to post-silicon debugging.
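
To make the decomposition idea concrete, the following is a minimal sketch and not the paper's graph-based algorithm. It assumes that a "Boolean combination" is a bitwise OR of basis subtraces, that subtraces are flattened to 0/1 tuples, and that the basis is already given. The names `or_combine`, `sparse_decompose`, and the sparsity bound `k` are hypothetical, introduced only for illustration. The brute-force search checks whether each subtrace can be written as the OR of at most `k` basis subtraces.

```python
from itertools import combinations

def or_combine(vectors, width):
    """Bitwise OR of a list of equal-length 0/1 tuples."""
    out = (0,) * width
    for v in vectors:
        out = tuple(a | b for a, b in zip(out, v))
    return out

def sparse_decompose(subtrace, basis, k):
    """Return indices of at most k basis subtraces whose OR equals
    `subtrace`, preferring fewer basis elements; None if no such set exists.
    Assumption: 'Boolean combination' is modelled as bitwise OR."""
    width = len(subtrace)
    for size in range(1, k + 1):
        for idx in combinations(range(len(basis)), size):
            if or_combine([basis[i] for i in idx], width) == subtrace:
                return idx
    return None

# Hypothetical basis of three subtraces, each of length p = 4.
basis = [(1, 0, 0, 1), (0, 1, 0, 0), (0, 0, 1, 1)]

# Observed subtraces; the last one is not explained by the basis.
subtraces = [(1, 1, 0, 1), (0, 0, 1, 1), (1, 0, 1, 1), (0, 1, 1, 0)]

result = {s: sparse_decompose(s, basis, k=2) for s in subtraces}
for s, idx in result.items():
    print(s, "->", idx)
```

In this toy setting, a subtrace for which no sparse decomposition exists (here the last one) is a candidate for closer inspection, which is the intuition behind using the basis for error localization. The exhaustive search is exponential in `k` and is meant only to illustrate the problem statement.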
