An Empirical Validation of Cognitive Complexity as a Measure of Source Code Understandability

Background: Developers spend much of their time understanding source code. Static code analysis tools can draw attention to code that is difficult for developers to understand. However, most of these findings are based on non-validated metrics, which can lead to confusion and to hard-to-understand code going unidentified. Aims: In this work, we validate a metric called Cognitive Complexity, which was explicitly designed to measure code understandability and which is already widely used due to its integration in well-known static code analysis tools. Method: We conducted a systematic literature search to obtain data sets from studies that measured code understandability. In this way, we obtained about 24,000 understandability evaluations of 427 code snippets. We calculated the correlations of these measurements with the corresponding metric values and statistically summarized the correlation coefficients through a meta-analysis. Results: Cognitive Complexity correlates positively with comprehension time and with subjective ratings of understandability. The metric showed mixed results for the correlation with the correctness of comprehension tasks and with physiological measures. Conclusions: Cognitive Complexity is the first validated, solely code-based metric that reflects at least some aspects of code understandability. Moreover, because of its methodology, this work shows that code understanding is currently measured in many different ways whose relationships to one another are unknown. This makes it difficult to compare the results of individual studies and to develop a metric that measures code understanding in all its facets.
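
For readers unfamiliar with the metric: Cognitive Complexity increments for each break in linear control flow and adds an extra penalty for every level of nesting, unlike Cyclomatic Complexity, which counts branch points uniformly. The Python sketch below is only a simplified approximation of that idea, not the official SonarSource rule set (which also covers boolean operator sequences, recursion, jumps, and more); the function name and the example snippet are purely illustrative.

import ast
import textwrap

def cognitive_complexity(source: str) -> int:
    # Simplified approximation: each branching or looping construct
    # adds 1, plus 1 for every level of nesting it sits in.
    tree = ast.parse(textwrap.dedent(source))
    total = 0

    def visit(node, nesting):
        nonlocal total
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.If, ast.For, ast.While, ast.Try, ast.ExceptHandler)):
                total += 1 + nesting       # structural increment plus nesting penalty
                visit(child, nesting + 1)  # children sit one nesting level deeper
            else:
                visit(child, nesting)

    visit(tree, 0)
    return total

snippet = """
def find(items, target):
    for i, row in enumerate(items):          # +1
        for j, value in enumerate(row):      # +2 (nested)
            if value == target:              # +3 (doubly nested)
                return i, j
    return None
"""
print(cognitive_complexity(snippet))  # prints 6 under these simplified rules

Each of the three control-flow constructs would contribute the same increment to Cyclomatic Complexity; the nesting penalty, which makes the inner constructs cost more, is what distinguishes Cognitive Complexity.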

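The meta-analytic summary described in the Method section is typically performed on Fisher's z-transformed correlation coefficients pooled with inverse-variance weights (coefficients reported on other scales, such as Kendall's tau, first need to be converted). The sketch below is a minimal illustration of that pooling idea with invented correlations and sample sizes; it uses the simpler fixed-effect variant, whereas a full analysis such as the paper's would typically use a random-effects model that also accounts for between-study variance.

import math

def fisher_z(r):
    # Fisher's z transformation stabilizes the variance of a correlation.
    return 0.5 * math.log((1 + r) / (1 - r))

def inverse_fisher_z(z):
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

# Hypothetical per-study correlations between Cognitive Complexity and
# comprehension time, each with its sample size (invented numbers).
studies = [(0.42, 50), (0.31, 121), (0.55, 24), (0.18, 72)]

# Pool in z-space, weighting each study by the inverse of its variance,
# Var(z_i) = 1 / (n_i - 3).
weights = [n - 3 for _, n in studies]
zs = [fisher_z(r) for r, _ in studies]
pooled_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
se_pooled = math.sqrt(1 / sum(weights))

pooled_r = inverse_fisher_z(pooled_z)
ci_low = inverse_fisher_z(pooled_z - 1.96 * se_pooled)
ci_high = inverse_fisher_z(pooled_z + 1.96 * se_pooled)
print(f"pooled r = {pooled_r:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")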