Using Pattern Position Distribution for Software Failure Detection

We present a novel approach that uses pattern position distributions as features for detecting software failures. In this approach, we divide an execution sequence into several sections and compute the distribution of patterns in each section. The distributions of all patterns are then used as features to train a classifier. This approach outperforms conventional frequency-based methods by more effectively identifying software failures that arise from misused software patterns. Comparative experiments show the effectiveness of our approach.
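
The feature construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `position_distribution`, the use of contiguous pattern matching, the assignment of each occurrence to a section by its starting index, and the per-pattern normalization are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def position_distribution(
    trace: Sequence[str],
    patterns: Sequence[Tuple[str, ...]],
    num_sections: int,
) -> List[float]:
    """Build a feature vector of per-section pattern distributions.

    Assumptions (not taken from the paper): patterns are matched as
    contiguous subsequences, an occurrence belongs to the section that
    contains its first event, and counts are normalized per pattern.
    """
    n = len(trace)
    features: List[float] = []
    for pattern in patterns:
        m = len(pattern)
        counts = [0] * num_sections
        # Slide over every possible starting position in the trace.
        for start in range(n - m + 1):
            if tuple(trace[start:start + m]) == tuple(pattern):
                # Map the starting index to one of the equal-width sections.
                counts[start * num_sections // n] += 1
        total = sum(counts)
        # Fraction of this pattern's occurrences in each section
        # (all zeros when the pattern never occurs).
        features.extend(c / total if total else 0.0 for c in counts)
    return features


# Hypothetical execution trace and patterns, for illustration only.
trace = ["open", "read", "close", "open", "read", "read", "close", "close"]
patterns = [("open", "read"), ("close",)]
features = position_distribution(trace, patterns, num_sections=2)
```

Under these assumptions, two traces in which a pattern occurs equally often but at different positions yield different feature vectors, whereas a purely frequency-based representation would not distinguish them. The resulting vectors could then be passed to any standard classifier.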
