Investigating the architectural drivers of defects in open-source software systems: an empirical study of defects and reopened defects in GNOME

In major software systems developed by competent software engineers, defects in production are unlikely to be considered acceptable. Yet in several such systems, defects remain a reality. Furthermore, a noticeable number of defects are fixed only to be reopened later. Defects can be frustrating for all stakeholders, and when they require constant rework, they can lead to the problematic code-test-code-test mode of development. For management, such conditions can result in slipped schedules and increased development costs; for upper management and users, they can result in a loss of confidence in the product. This study examines the drivers of defects in the mature open-source project GNOME and explores the relationship between these drivers and software quality. Using defect-activity and source-code data for 32 systems over a period of eight years, the work presents a multiple regression model capable of explaining 16.2% of the variation in defects and a logistic regression model capable of explaining between 13.6% and 18.1% of the variation in reopened defects. The study also shows that, although defects in general and reopened defects appear to move together, they correlate with different measures of complexity. Defects in general correlate with a measure that captures how components connect to each other, whereas reopened defects correlate with a measure that captures the inner complexity of components. This suggests that different types of defects are associated with different forms of complexity.

Thesis advisor: Alan D. MacCormack
Title: Adjunct Professor, Harvard Business School, Harvard University; formerly Visiting Associate Professor, Sloan School of Management
