Validating Object-Oriented Design Metrics on a Commercial Java Application

Evidence suggests that most field faults in software applications are found in a small percentage of the software's components. If these faulty components can be detected early in the development life cycle, mitigating actions, such as a redesign, can be taken. For object-oriented applications, prediction models based on design metrics can identify faulty classes early on. Furthermore, once a relationship between the metrics and class fault-proneness has been demonstrated, the metrics can be used to construct design guidelines. In this paper, we present a cognitive theory of object-oriented metrics and an empirical study with two objectives: to test this theory formally while validating the metrics, and to build a post-release fault-proneness prediction model. The cognitive mechanisms that we apply to object-oriented metrics are based on contemporary models of human memory. They are familiarity, interference, and fan effects. Our empirical study was performed with data from a commercial Java application. We found that Depth of Inheritance Tree (DIT) is a good measure of familiarity and, as predicted, has a quadratic relationship with fault-proneness. Our hypotheses were confirmed for the Import Coupling to other classes, Export Coupling, and Number of Children metrics. The ancestor-based Import Coupling metrics were not associated with fault-proneness after controlling for the confounding effect of DIT. The prediction model we constructed had good accuracy. Finally, we formulated a cost-savings model and applied it to our prediction model. This demonstrated a 42% reduction in post-release costs if the prediction model is used to identify the classes that should be inspected.
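
To make the quadratic relationship between DIT and fault-proneness concrete, one possible formalization is sketched below. It assumes a logistic regression form, which the abstract does not state, and the coefficients \beta_0, \beta_1, \beta_2 are hypothetical symbols introduced only for illustration; \pi denotes the probability that a class contains a post-release fault.

```latex
% Illustrative sketch only: the logistic form and the symbols
% \beta_0, \beta_1, \beta_2 are assumptions, not taken from the paper.
\[
  \pi(\mathrm{DIT}) =
  \frac{1}{1 + e^{-\left(\beta_0 + \beta_1\,\mathrm{DIT} + \beta_2\,\mathrm{DIT}^2\right)}}
\]
% With \beta_2 \neq 0, the linear predictor has a single turning point at
\[
  \mathrm{DIT}^{*} = -\frac{\beta_1}{2\,\beta_2}
\]
```

Under this assumed form, a quadratic effect means that the estimated fault-proneness does not change monotonically with inheritance depth but turns at DIT*, which is why a linear DIT term alone could miss the relationship.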
