Predicting software defects with causality tests

Abstract: In this paper, we propose a defect prediction approach centered on more robust evidence of causality between source code metrics (as predictors) and the occurrence of defects. More specifically, we rely on the Granger causality test to evaluate whether past variations in source code metric values can be used to forecast changes in time series of defects. Our approach triggers alarms when changes made to the source code of a target system have a high chance of producing defects. We evaluated our approach in several life stages of four Java-based systems. We reached an average precision greater than 50% in three out of the four systems we evaluated. Moreover, our approach achieved better precision than baselines that are not based on causality tests.
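To make the idea behind the Granger causality test concrete, the following is a minimal sketch, not the implementation used in the paper. It compares two least-squares models of a defect time series: one using only past defect values, and one that also uses past values of a source code metric. If adding the metric lags reduces the residual error substantially, the F statistic is large, which is evidence that the metric Granger-causes defects. The function names, the lag of 1, and the synthetic series are all assumptions made for illustration; the test also presumes stationary series, which this sketch does not check.

```python
import random
from typing import List, Sequence


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_sum_of_squares(rows: List[List[float]], target: Sequence[float]) -> float:
    """Fit target ~ rows by ordinary least squares and return the RSS."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, target)
    )


def granger_f_statistic(cause: Sequence[float], effect: Sequence[float], lag: int) -> float:
    """F statistic for 'cause Granger-causes effect' with the given lag."""
    n = len(effect)
    target = effect[lag:]
    # Restricted model: intercept plus the effect's own past values.
    restricted = [
        [1.0] + [effect[t - i] for i in range(1, lag + 1)] for t in range(lag, n)
    ]
    # Unrestricted model: additionally uses past values of the candidate cause.
    unrestricted = [
        row + [cause[t - i] for i in range(1, lag + 1)]
        for row, t in zip(restricted, range(lag, n))
    ]
    rss_r = residual_sum_of_squares(restricted, target)
    rss_u = residual_sum_of_squares(unrestricted, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic data (an assumption for illustration): defects follow the
# previous period's metric value plus noise.
random.seed(0)
metric = [random.gauss(0, 1) for _ in range(60)]
defects = [0.0] + [0.8 * metric[t - 1] + random.gauss(0, 0.3) for t in range(1, 60)]

forward = granger_f_statistic(metric, defects, lag=1)  # metric -> defects
reverse = granger_f_statistic(defects, metric, lag=1)  # defects -> metric
print(f"F(metric -> defects) = {forward:.2f}")
print(f"F(defects -> metric) = {reverse:.2f}")
```

On this synthetic data the statistic for the metric-to-defects direction should be far larger than for the reverse direction, because the defect series was built from lagged metric values. In practice the F statistic would be converted to a p-value and compared against a chosen significance level before raising an alarm; how the paper sets that level is not stated in this abstract.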
