Architectural Decay as Predictor of Issue- and Change-Proneness

Architectural decay imposes real costs in terms of developer effort, system correctness, and performance. Over time, those problems are likely to be revealed as explicit implementation issues (defects, feature changes, etc.). Recent empirical studies have demonstrated that there is a significant correlation between architectural "smells"—manifestations of architectural decay—and implementation issues. In this paper, we take a step further in exploring this phenomenon. We analyze the available development data from 10 open-source software systems and show that information regarding current architectural decay in these systems can be used to build models that accurately predict future issue-proneness and change-proneness of the systems’ implementations. As a less intuitive result, we also show that, in cases where historical data for a system is unavailable, such data from other, unrelated systems can provide reasonably accurate issue- and change-proneness prediction capabilities.
