Journal of Software Maintenance and Evolution: Research and Practice Mining Temporal Rules for Software Maintenance

Software evolution incurs difficulties in program comprehension and software verification, and hence it increases the cost of software maintenance. In this study, we propose a novel technique to mine from program execution traces a sound and complete set of statistically significant temporal rules of arbitrary lengths. The extracted temporal rules reveal invariants that the program observes, and will consequently guide developers to understand the program behaviors, and facilitate all downstream applications such as verification and debugging. Different from previous studies that were restricted to mining two-event rules (e.g., 〈lock〉→〈unlock〉), our algorithm discovers rules of arbitrary lengths. In order to facilitate downstream applications, we represent the mined rules as temporal logic expressions, so that existing model checkers or other formal analysis toolkit can readily consume our mining results. Performance studies on benchmark data sets and a case study on an industrial system have been performed to show the scalability and utility of our approach. We performed case studies on JBoss application server and a buggy concurrent versions system application, and the result clearly demonstrates the usefulness of our technique in recovering underlying program designs and detecting bugs. Copyright © 2008 John Wiley & Sons, Ltd.

[1]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[2]  Siau-Cheng Khoo,et al.  SMArTIC: towards building an accurate, robust and scalable specification miner , 2006, SIGSOFT '06/FSE-14.

[3]  Toon Calders,et al.  Applying Webmining techniques to execution traces to support the program comprehension process , 2005, Ninth European Conference on Software Maintenance and Reengineering.

[4]  Adrian Kuhn,et al.  Exploiting the Analogy Between Traces and Signal Processing , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[5]  Stephan Merz,et al.  Model Checking , 2000 .

[6]  Chao Liu,et al.  Efficient mining of iterative patterns for software specification discovery , 2007, KDD '07.

[7]  Siau-Cheng Khoo,et al.  QUARK: Empirical Assessment of Automaton-based Specification Miners , 2006, 2006 13th Working Conference on Reverse Engineering.

[8]  James C. Corbett,et al.  Bandera: extracting finite-state models from Java source code , 2000, ICSE.

[9]  Qiming Chen,et al.  PrefixSpan,: mining sequential patterns efficiently by prefix-projected pattern growth , 2001, Proceedings 17th International Conference on Data Engineering.

[10]  Abdelwahab Hamou-Lhadj,et al.  Summarizing the Content of Large Traces to Facilitate the Understanding of the Behaviour of a Software System , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[11]  Carla E. Brodley,et al.  KDD-Cup 2000 organizers' report: peeling the onion , 2000, SKDD.

[12]  Mohamed E. Fayad Software Maintenance , 2005, IEEE Softw..

[13]  Chao Liu,et al.  Mining Recurrent Rules from Sequence Database , 2007 .

[14]  James R. Larus,et al.  EEL: machine-independent executable editing , 1995, PLDI '95.

[15]  Alexander L. Wolf,et al.  Discovering models of software processes from event-based data , 1998, TSEM.

[16]  Petra Perner,et al.  Data Mining - Concepts and Techniques , 2002, Künstliche Intell..

[17]  Jiawei Han,et al.  BIDE: efficient mining of frequent closed sequences , 2004, Proceedings. 20th International Conference on Data Engineering.

[18]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[19]  Yuri Gurevich,et al.  Logic in Computer Science , 1993, Current Trends in Theoretical Computer Science.

[20]  George C. Necula,et al.  Mining Temporal Specifications for Error Detection , 2005, TACAS.

[21]  Thomas Ball,et al.  The concept of dynamic analysis , 1999, ESEC/FSE-7.

[22]  Manuvir Das,et al.  Perracotta: mining temporal API rules from imperfect traces , 2006, ICSE.

[23]  William G. Griswold,et al.  Dynamically discovering likely program invariants to support program evolution , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[24]  Dawson R. Engler,et al.  Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.

[25]  George S. Avrunin,et al.  Patterns in property specifications for finite-state verification , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[26]  Andrew Hinton,et al.  PRISM: A Tool for Automatic Verification of Probabilistic Systems , 2006, TACAS.

[27]  Steven P. Reiss,et al.  Encoding program executions , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[28]  Mark Ryan,et al.  Logic in Computer Science: Modelling and Reasoning about Systems , 2000 .

[29]  Xifeng Yan,et al.  CloSpan: Mining Closed Sequential Patterns in Large Datasets , 2003, SDM.