Using information retrieval based coupling measures for impact analysis

Coupling is an important property of software systems, which directly impacts program comprehension. In addition, the strength of coupling measured between modules in software is often used as a predictor of external software quality attributes such as changeability, ripple effects of changes and fault-proneness. This paper presents a new set of coupling measures for Object-Oriented (OO) software systems measuring conceptual coupling of classes. Conceptual coupling is based on measuring the degree to which the identifiers and comments from different classes relate to each other. This type of relationship, called conceptual coupling, is measured through the use of Information Retrieval (IR) techniques. The proposed measures are different from existing coupling measures and they capture new dimensions of coupling, which are not captured by the existing coupling measures. The paper investigates the use of the conceptual coupling measures during change impact analysis. The paper reports the findings of a case study in the source code of the Mozilla web browser, where the conceptual coupling metrics were compared to nine existing structural coupling metrics and proved to be better predictors for classes impacted by changes.

[1]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..

[2]  Letha H. Etzkorn,et al.  Empirical Validation of Three Software Metrics Suites to Predict Fault-Proneness of Object-Oriented Classes Developed Using Highly Iterative or Agile Software Development Processes , 2007, IEEE Transactions on Software Engineering.

[3]  Letha H. Etzkorn,et al.  A semantic entropy metric , 2002, J. Softw. Maintenance Res. Pract..

[4]  Ewan D. Tempero,et al.  Detecting indirect coupling , 2005, 2005 Australian Software Engineering Conference.

[5]  Václav Rajlich,et al.  Hidden dependencies in program comprehension and change propagation , 2001, Proceedings 9th International Workshop on Program Comprehension. IWPC 2001.

[6]  Feng-Jian Wang,et al.  Measuring Coupling And Cohesion of an Object-Oriented Program , 1995 .

[7]  Gail E. Kaiser,et al.  An Information Retrieval Approach For Automatically Constructing Software Libraries , 1991, IEEE Trans. Software Eng..

[8]  Sushil Krishna Bajracharya,et al.  Mining Eclipse Developer Contributions via Author-Topic Models , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[9]  Shawn A. Bohner,et al.  Impact analysis in the software change process: a year 2000 perspective , 1996, 1996 Proceedings of International Conference on Software Maintenance.

[10]  Soo Dong Kim,et al.  Component identification method with coupling and cohesion , 2001, Proceedings Eighth Asia-Pacific Software Engineering Conference.

[11]  F KemererChris,et al.  Towards a metrics suite for object oriented design , 1991 .

[12]  Ioana M. Boier-Martin,et al.  Visualization Viewpoints , 2000 .

[13]  Václav Rajlich,et al.  Case study of feature location using dependence graph , 2000, Proceedings IWPC 2000. 8th International Workshop on Program Comprehension.

[14]  Letha H. Etzkorn,et al.  Towards a semantic metrics suite for object-oriented design , 2000, Proceedings. 34th International Conference on Technology of Object-Oriented Languages and Systems - TOOLS 34.

[15]  Chris F. Kemerer,et al.  Towards a metrics suite for object oriented design , 2017, OOPSLA '91.

[16]  I. Jolliffe Principal Component Analysis , 2002 .

[17]  Gregg Rothermel,et al.  An empirical comparison of dynamic impact analysis algorithms , 2004, Proceedings. 26th International Conference on Software Engineering.

[18]  Letha H. Etzkorn,et al.  Automatically Identifying Reusable OO Legacy Code , 1997, Computer.

[19]  James F. Power,et al.  A study of the influence of coverage on the relationship between static and dynamic coupling metrics , 2006, Sci. Comput. Program..

[20]  Taghi M. Khoshgoftaar,et al.  Measuring coupling and cohesion of software modules: an information-theory approach , 2001, Proceedings Seventh International Software Metrics Symposium.

[21]  Lionel C. Briand,et al.  A comprehensive empirical validation of design measures for object-oriented systems , 1998, Proceedings Fifth International Software Metrics Symposium. Metrics (Cat. No.98TB100262).

[22]  Amir Michail,et al.  Assessing software libraries by browsing similar classes, functions and relationships , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[23]  Robert K. Yin,et al.  Applications of case study research , 1993 .

[24]  Tao Xie,et al.  An approach to detecting duplicate bug reports using natural language and execution information , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[25]  Javam C. Machado,et al.  The prediction of faulty classes using object-oriented design metrics , 2001, J. Syst. Softw..

[26]  Malcolm Munro,et al.  The impact analysis task in software maintenance: a model and a case study , 1994, Proceedings 1994 International Conference on Software Maintenance.

[27]  Massimiliano Di Penta,et al.  An approach to classify software maintenance requests , 2002, International Conference on Software Maintenance, 2002. Proceedings..

[28]  Lionel C. Briand,et al.  Using coupling measurement for impact analysis in object-oriented systems , 1999, Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360).

[29]  Janice Singer,et al.  Hipikat: a project memory for software development , 2005, IEEE Transactions on Software Engineering.

[30]  Alfred V. Aho,et al.  CERBERUS: Tracing Requirements to Source Code Using Information Retrieval, Dynamic Analysis, and Program Analysis , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[31]  Gerhard Fischer,et al.  Reuse-Conducive Development Environments , 2005, Automated Software Engineering.

[32]  Andrian Marcus,et al.  Supporting program comprehension using semantic and structural information , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[33]  Giuliano Antoniol,et al.  Using metrics to identify design patterns in object-oriented software , 1998, Proceedings Fifth International Software Metrics Symposium. Metrics (Cat. No.98TB100262).

[34]  Denys Poshyvanyk,et al.  Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code , 2007, 15th IEEE International Conference on Program Comprehension (ICPC '07).

[35]  William C. Chu,et al.  A measure for composite module cohesion , 1992, International Conference on Software Engineering.

[36]  Wei Zhao,et al.  SNIAFL: towards a static non-interactive approach to feature location , 2004, Proceedings. 26th International Conference on Software Engineering.

[37]  Steve Counsell,et al.  A dynamic runtime coupling metric for meta-level architectures , 2004, Eighth European Conference on Software Maintenance and Reengineering, 2004. CSMR 2004. Proceedings..

[38]  Yann-Gaël Guéhéneuc,et al.  Mining the Lexicon Used by Programmers during Sofware Evolution , 2007, 2007 IEEE International Conference on Software Maintenance.

[39]  Tibor Gyimóthy,et al.  Columbus - reverse engineering tool and schema for C++ , 2002, International Conference on Software Maintenance, 2002. Proceedings..

[40]  Denis Gracanin,et al.  Software impact analysis in a virtual environment , 2003, 28th Annual NASA Goddard Software Engineering Workshop, 2003. Proceedings..

[41]  Jianjun Zhao Measuring Coupling in Aspect-Oriented Systems , 2004 .

[42]  Andrian Marcus,et al.  An information retrieval approach to concept location in source code , 2004, 11th Working Conference on Reverse Engineering.

[43]  Victoria Interrante,et al.  User Studies: Why, How, and When? , 2003, IEEE Computer Graphics and Applications.

[44]  Lei Wang,et al.  Relevancy based semantic interoperation of reuse repositories , 2004, SIGSOFT '04/FSE-12.

[45]  Hermann Kaindl,et al.  Coupling and cohesion metrics for knowledge-based systems using frames and rules , 2004, TSEM.

[46]  Harald C. Gall,et al.  CVS release history data for detecting logical couplings , 2003, Sixth International Workshop on Principles of Software Evolution, 2003. Proceedings..

[47]  Audris Mockus,et al.  Identifying reasons for software changes using historic databases , 2000, Proceedings 2000 International Conference on Software Maintenance.

[48]  Bernd Fischer Specification-Based Browsing of Software Component Libraries , 2004, Automated Software Engineering.

[49]  Giuliano Antoniol,et al.  Recovering Traceability Links between Code and Documentation , 2002, IEEE Trans. Software Eng..

[50]  Per Runeson,et al.  Detection of Duplicate Defect Reports Using Natural Language Processing , 2007, 29th International Conference on Software Engineering (ICSE'07).

[51]  Tibor Gyimóthy,et al.  Empirical validation of object-oriented metrics on open source software for fault prediction , 2005, IEEE Transactions on Software Engineering.

[52]  Premkumar T. Devanbu,et al.  An Investigation into Coupling Measures for C++ , 1997, Proceedings of the (19th) International Conference on Software Engineering.

[53]  Hausi A. Müller,et al.  Predicting fault-proneness using OO metrics. An industrial case study , 2002, Proceedings of the Sixth European Conference on Software Maintenance and Reengineering.

[54]  Beszé Columbus - Reverse Engineering Tool and Schema for C++ , 2002 .

[55]  Andrian Marcus,et al.  Static techniques for concept location in object-oriented code , 2005, 13th International Workshop on Program Comprehension (IWPC'05).

[56]  Jane Cleland-Huang,et al.  Utilizing supporting evidence to improve dynamic requirements traceability , 2005, 13th IEEE International Conference on Requirements Engineering (RE'05).

[57]  Fernando Brito e Abreu,et al.  A coupling-guided cluster analysis approach to reengineer the modularity of object-oriented systems , 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[58]  Andrian Marcus,et al.  Identification of high-level concept clones in source code , 2001, Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001).

[59]  Rudolf Ferenc,et al.  Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems , 2008, IEEE Transactions on Software Engineering.

[60]  Elisa L. A. Baniassad,et al.  Isolating and relating concerns in requirements using latent semantic analysis , 2006, OOPSLA '06.

[61]  B. Flyvbjerg Five Misunderstandings About Case-Study Research , 2006, 1304.1186.

[62]  Michael W. Godfrey,et al.  Detecting Interaction Coupling from Task Interaction Histories , 2007, 15th IEEE International Conference on Program Comprehension (ICPC '07).

[63]  Letha H. Etzkorn,et al.  A New Suite of Metrics for Object-Oriented Software , 2004, Software Audit and Metrics.

[64]  Barbara G. Ryder,et al.  Points-to analysis for Java using annotated constraints , 2001, OOPSLA '01.

[65]  Lionel C. Briand,et al.  Dynamic coupling measurement for object-oriented software , 2004, IEEE Transactions on Software Engineering.

[66]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[67]  Arie van Deursen,et al.  Can LSI help reconstructing requirements traceability in design and test? , 2006, Conference on Software Maintenance and Reengineering (CSMR'06).

[68]  Jane Huffman Hayes,et al.  Advancing candidate link generation for requirements tracing: the study of methods , 2006, IEEE Transactions on Software Engineering.

[69]  Denys Poshyvanyk,et al.  The conceptual cohesion of classes , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[70]  Sallie M. Henry,et al.  Object-oriented metrics that predict maintainability , 1993, J. Syst. Softw..

[71]  Genny Tortora,et al.  Recovering traceability links in software artifact management systems using information retrieval methods , 2007, TSEM.

[72]  Katsuro Inoue,et al.  MUDABlue: An Automatic Categorization System for Open Source Repositories , 2004, APSEC.

[73]  A. Jefferson Offutt,et al.  A software metric system for module coupling , 1993, J. Syst. Softw..

[74]  Yoelle Maarek,et al.  Integrating information retrieval and domain specific approaches for browsing and retrieval in object-oriented class libraries , 1991, OOPSLA '91.

[75]  Yann-Gaël Guéhéneuc,et al.  Feature Location Using Probabilistic Ranking of Methods Based on Execution Scenarios and Information Retrieval , 2007, IEEE Transactions on Software Engineering.

[76]  Lionel C. Briand,et al.  A Unified Framework for Coupling Measurement in Object-Oriented Systems , 1999, IEEE Trans. Software Eng..

[77]  S. Siegel,et al.  Nonparametric Statistics for the Behavioral Sciences , 2022, The SAGE Encyclopedia of Research Design.

[78]  Chris F. Kemerer,et al.  A Metrics Suite for Object Oriented Design , 2015, IEEE Trans. Software Eng..

[79]  Andrian Marcus,et al.  Recovery of Traceability Links between Software Documentation and Source Code , 2005, Int. J. Softw. Eng. Knowl. Eng..

[80]  David W. Binkley,et al.  Leveraged Quality Assessment using Information Retrieval Techniques , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[81]  Letha H. Etzkorn,et al.  Coupling metrics for ontology-based system , 2006, IEEE Software.

[82]  Gerardo Canfora,et al.  Impact analysis by mining software and change request repositories , 2005, 11th IEEE International Software Metrics Symposium (METRICS'05).

[83]  Stéphane Ducasse,et al.  Semantic clustering: Identifying topics in source code , 2007, Inf. Softw. Technol..

[84]  Barbara A. Kitchenham,et al.  Coupling measures and change ripples in C++ application software , 2000, J. Syst. Softw..

[85]  Martin P. Robillard,et al.  Automatic generation of suggestions for program investigation , 2005, ESEC/FSE-13.

[86]  Giuliano Antoniol,et al.  Identifying the starting impact set of a maintenance request: a case study , 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[87]  Emily Hill,et al.  Exploring the neighborhood with dora to expedite software maintenance , 2007, ASE '07.