A methodology for the generation of efficient error detection mechanisms

A dependable software system must contain both error detection mechanisms and error recovery mechanisms. Software components for the detection of errors are typically designed from a system specification or from the experience of software engineers, and their efficiency is typically measured using fault injection and metrics such as coverage and latency. In this paper, we introduce a methodology for the design of highly efficient error detection mechanisms. The proposed methodology combines fault injection analysis with data mining techniques in order to generate predicates for efficient error detection mechanisms. The results presented demonstrate the viability of the methodology as an approach for the development of efficient error detection mechanisms: the generated predicates detect failure-inducing states with a true positive rate of almost 100% and a false positive rate very close to 0%. The main advantage of the proposed methodology over current state-of-the-art approaches is that efficient detectors are obtained by design, rather than through specification-based design or reliance on the experience of software engineers.
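To make the idea concrete, the following is a minimal, self-contained sketch of the pipeline the abstract describes: run fault-injection experiments to obtain program states labeled as failure-inducing or benign, learn a detection predicate from that data, and evaluate it with true/false positive rates. All names here are illustrative, and the learner is deliberately simplified: the paper uses data mining (decision-tree-style induction) over fault-injection traces, whereas this sketch substitutes a simple interval-invariant learner on one monitored variable for brevity.

```python
# Illustrative sketch only: synthetic fault injection + predicate learning.
# The real methodology mines predicates from fault-injection traces of an
# actual program; here a single monitored variable stands in for the state.
import random

random.seed(0)

def run_with_injection():
    """One synthetic experiment: return (observed value, failure label)."""
    value = random.gauss(50, 20)
    if random.random() < 0.2:            # injected fault corrupts the state
        value += random.choice([-1, 1]) * random.uniform(80, 200)
    failure = not (0 <= value <= 100)    # ground truth from a failure oracle
    return value, failure

data = [run_with_injection() for _ in range(5000)]
train, test = data[:4000], data[4000:]

def learn_predicate(samples):
    """Learn a detector: flag states outside the observed benign range."""
    ok = [v for v, failed in samples if not failed]
    lo, hi = min(ok), max(ok)
    return lambda v: not (lo <= v <= hi)  # True => raise error-detection alarm

detect = learn_predicate(train)

# Evaluate the learned detector on held-out injections.
failures = [v for v, failed in test if failed]
benign = [v for v, failed in test if not failed]
tpr = sum(detect(v) for v in failures) / len(failures)
fpr = sum(detect(v) for v in benign) / len(benign)
```

Because the learned interval is contained in the true benign range, every failure-inducing state is flagged (TPR of 1.0), and only benign states near the range boundaries can trigger false alarms, mirroring the near-100%/near-0% trade-off reported in the abstract.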
