Analysis and Prediction of Mandelbugs in an Industrial Software System

Mandelbugs are faults that are triggered by complex conditions, such as interaction with hardware and other software, and timing or ordering of events. These faults are considerably difficult to detect with traditional testing techniques, since it can be challenging to control their complex triggering conditions in a testing environment. Therefore, it is necessary to adopt specific verification and/or fault-tolerance strategies for dealing with them in a cost-effective way. In this paper, we investigate how to predict the location of Mandelbugs in complex software systems, in order to focus V&V activities and fault tolerance mechanisms in those modules where Mandelbugs are most likely present. In the context of an industrial complex software system, we empirically analyze Mandelbugs, and investigate an approach for Mandelbug prediction based on a set of novel software complexity metrics. Results show that Mandelbugs account for a noticeable share of faults, and that the proposed approach can predict Mandelbug-prone modules with greater accuracy than the sole adoption of traditional software metrics.

[1]  Andreas Zeller,et al.  Mining metrics to predict component failures , 2006, ICSE.

[2]  Richard F. Gunst,et al.  Applied Regression Analysis , 1999, Technometrics.

[3]  Yuanyuan Zhou,et al.  Rx: treating bugs as allergies---a safe method to survive software failures , 2005, SOSP '05.

[4]  Benjamin Schleich,et al.  How does testing affect the availability of aging software systems? , 2013, Perform. Evaluation.

[5]  Elaine J. Weyuker,et al.  Predicting the location and number of faults in large software systems , 2005, IEEE Transactions on Software Engineering.

[6]  Harald C. Gall,et al.  Cross-project defect prediction: a large scale experiment on data vs. domain vs. process , 2009, ESEC/SIGSOFT FSE.

[7]  Thomas Ball,et al.  Finding and Reproducing Heisenbugs in Concurrent Programs , 2008, OSDI.

[8]  Vikram S. Adve,et al.  An empirical study of reported bugs in server software with implications for automated bug diagnosis , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[9]  N. Draper,et al.  Applied Regression Analysis: Draper/Applied Regression Analysis , 1998 .

[10]  Jim Gray,et al.  Why Do Computers Stop and What Can Be Done About It? , 1986, Symposium on Reliability in Distributed Software and Database Systems.

[11]  Ram Chillarege Understanding Bohr-Mandel Bugs through ODC Triggers and a Case Study with Empirical Estimations of Their Field Proportion , 2011, 2011 IEEE Third International Workshop on Software Aging and Rejuvenation.

[12]  Victor R. Basili,et al.  Software errors and complexity: an empirical investigation , 1993 .

[13]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[14]  Edward N. Adams,et al.  Optimizing Preventive Service of Software Products , 1984, IBM J. Res. Dev..

[15]  Peter M. Chen,et al.  Whither generic recovery from application faults? A fault study using open-source software , 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.

[16]  Yue Jiang,et al.  Fault Prediction using Early Lifecycle Data , 2007, The 18th IEEE International Symposium on Software Reliability (ISSRE '07).

[17]  Ayse Basar Bener,et al.  On the relative value of cross-company and within-company data for defect prediction , 2009, Empirical Software Engineering.

[18]  Shari Lawrence Pfleeger,et al.  Software Metrics : A Rigorous and Practical Approach , 1998 .

[19]  Bora Caglayan,et al.  Different strokes for different folks: a case study on software metrics for different defect categories , 2011, WETSoM '11.

[20]  Tracy Hall,et al.  A Systematic Literature Review on Fault Prediction Performance in Software Engineering , 2012, IEEE Transactions on Software Engineering.

[21]  Gerard J. Holzmann,et al.  The Model Checker SPIN , 1997, IEEE Trans. Software Eng..

[22]  Yennun Huang,et al.  Software rejuvenation: analysis, module and applications , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.

[23]  Kishor S. Trivedi,et al.  An empirical investigation of fault types in space mission system software , 2010, 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).

[24]  Ravishankar K. Iyer,et al.  Software Dependability in the Tandem GUARDIAN System , 1995, IEEE Trans. Software Eng..

[25]  อนิรุธ สืบสิงห์,et al.  Data Mining Practical Machine Learning Tools and Techniques , 2014 .

[26]  W. Kent Fuchs,et al.  Progressive Retry for Software Failure Recovery in Message-Passing Applications , 1997, IEEE Trans. Computers.

[27]  Cheng Li,et al.  A study of the internal and external effects of concurrency bugs , 2010, 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).

[28]  Kishor S. Trivedi,et al.  Fighting bugs: remove, retry, replicate, and rejuvenate , 2007, Computer.

[29]  Bart Baesens,et al.  Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings , 2008, IEEE Transactions on Software Engineering.

[30]  Dong Seong Kim,et al.  Recovery from Failures Due to Mandelbugs in IT Systems , 2011, 2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing.

[31]  Banu Diri,et al.  A systematic review of software fault prediction studies , 2009, Expert Syst. Appl..

[32]  Forrest Shull,et al.  The empirical investigation of Perspective-Based Reading , 1995, Empirical Software Engineering.

[33]  Roy A. Maxion,et al.  Comparing anomaly-detection algorithms for keystroke dynamics , 2009, 2009 IEEE/IFIP International Conference on Dependable Systems & Networks.

[34]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[35]  Douglas C. Schmidt,et al.  Constructing reliable distributed communication systems with CORBA , 1997, IEEE Commun. Mag..

[36]  Tim Menzies,et al.  Data Mining Static Code Attributes to Learn Defect Predictors , 2007, IEEE Transactions on Software Engineering.

[37]  Yuanyuan Zhou,et al.  Learning from mistakes: a comprehensive study on real world concurrency bug characteristics , 2008, ASPLOS.

[38]  Bora Caglayan,et al.  Usage of multiple prediction models based on defect categories , 2010, PROMISE '10.

[39]  Tracy Halla,et al.  A Systematic Review of Fault Prediction Performance in Software Engineering a Systematic Review of Fault Prediction Performance in Software Engineering , 2011 .

[40]  Victor R. Basili,et al.  Software errors and complexity: an empirical investigation0 , 1984, CACM.

[41]  Niclas Ohlsson,et al.  Predicting Fault-Prone Software Modules in Telephone Switches , 1996, IEEE Trans. Software Eng..