Scenario-based and value-based specification mining: better together

Specification mining takes execution traces as input and extracts likely program invariants, which can be used for comprehension, verification, and evolution-related tasks. In this work we integrate two kinds of mining. The first is scenario-based specification mining, which uses a data-mining algorithm to suggest ordering constraints in the form of live sequence charts, an inter-object, visual, modal, scenario-based specification language. The second is mining of value-based invariants, which detects likely invariants holding at specific program points. The key to the integration is a technique we call scenario-based slicing, which runs on top of the mining algorithms to distinguish the scenario-specific invariants from the general ones. The resulting suggested specifications are rich, consisting of modal scenarios annotated with scenario-specific value-based invariants that refer to event parameters and participating object properties. We have implemented the mining algorithm and the visual presentation of the mined scenarios within a standard development environment. An evaluation of our work over a number of case studies shows promising results in extracting expressive specifications from real programs, which could not be extracted previously. The more expressive the mined specifications, the higher their potential to support program comprehension and testing.
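
To make the idea of scenario-based slicing more concrete, the following is a minimal sketch in Python, not the algorithm from this work. It assumes a trace is a flat list of events, a scenario is a plain sequence of method names matched greedily and without overlap (a strong simplification of live sequence chart semantics), and candidate invariants are given up front as predicates rather than inferred. All names (`Event`, `find_instances`, `slice_invariants`, the `withdraw` example) are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class Event(NamedTuple):
    method: str
    args: Dict[str, int]


def find_instances(trace: List[Event], scenario: List[str]) -> List[List[int]]:
    """Greedily match non-overlapping occurrences of `scenario` in `trace`.

    Returns, for each occurrence, the trace indices of its events.
    Assumption: simple subsequence matching, ignoring LSC modalities.
    """
    instances: List[List[int]] = []
    current: List[int] = []
    for i, ev in enumerate(trace):
        if ev.method == scenario[len(current)]:
            current.append(i)
            if len(current) == len(scenario):
                instances.append(current)
                current = []
    return instances


def slice_invariants(
    trace: List[Event],
    scenario: List[str],
    method: str,
    candidates: Dict[str, Callable[[Dict[str, int]], bool]],
) -> Dict[str, List[str]]:
    """Split candidate invariants on `method` into general vs scenario-specific."""
    in_scenario = {i for inst in find_instances(trace, scenario) for i in inst}
    # The slice: occurrences of `method` that take part in a scenario instance.
    sliced = [ev for i, ev in enumerate(trace)
              if i in in_scenario and ev.method == method]
    everywhere = [ev for ev in trace if ev.method == method]

    result: Dict[str, List[str]] = {"general": [], "scenario_specific": []}
    if not sliced:
        return result
    for name, pred in candidates.items():
        if all(pred(ev.args) for ev in everywhere):
            result["general"].append(name)
        elif all(pred(ev.args) for ev in sliced):
            result["scenario_specific"].append(name)
    return result


# Hypothetical trace: one withdrawal happens outside the login/logout scenario.
trace = [
    Event("login", {}),
    Event("withdraw", {"amount": 50}),
    Event("logout", {}),
    Event("withdraw", {"amount": 900}),
    Event("login", {}),
    Event("withdraw", {"amount": 20}),
    Event("logout", {}),
]
candidates = {
    "amount > 0": lambda a: a["amount"] > 0,
    "amount <= 100": lambda a: a["amount"] <= 100,
}
result = slice_invariants(trace, ["login", "withdraw", "logout"], "withdraw", candidates)
print(result)
```

In this toy trace, `amount > 0` holds for every `withdraw` event and is reported as general, while `amount <= 100` holds only for the withdrawals inside the scenario and is reported as scenario-specific. A real miner would infer both the scenarios and the candidate invariants from the traces instead of taking them as input.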
