A Human-Centric Approach to Program Understanding

Software development is a large global industry, but software products continue to ship with both known and unknown defects [60]. In the US, such defects cost firms many billions of dollars annually by compromising security, privacy, and functionality [73]. To mitigate this expense, recent research has focused on finding specific errors in code (e.g., [13, 25, 29, 34, 35, 47, 48, 61, 66, 86]). These important analyses hold out the possibility of identifying many types of implementation issues, but they fail to address a problem underlying all of them: software is difficult to understand. Professional software developers spend over 75% of their time trying to understand code [45, 76]. Reading code is the most time-consuming part [31, 39, 78, 85] of the most expensive activity [77, 87] in the software development process. Yet software comprehension as an activity is poorly understood by both researchers and practitioners [74, 106].

Our research seeks to develop a general and practical approach for analyzing program understandability from the perspective of real humans. In addition, we propose to develop tools that mechanically generate documentation to make programs easier to understand. We will focus on three key dimensions of program understandability: readability, a local judgment of how easy code is to understand; runtime behavior, a characterization of what a program was designed to do; and documentation, non-code text that aids in program understanding.

Our key technical insight lies in combining multiple surface features (e.g., identifier length or the number of assignment statements) to characterize aspects of programs that lack precise semantics; a minimal sketch of the idea appears after this overview. The use of lightweight features permits our techniques to scale to large programs and to generalize across multiple application domains. Additionally, we will continue to pioneer techniques [19] for generating output that is directly comparable to real-world human-created documentation. Such comparability is useful for evaluation, but it also suggests that our proposed tools could be readily integrated into current software engineering practice.

Software understandability becomes increasingly important as the number and size of software projects grow: as complexity increases, comprehending software and using it correctly become paramount. Fred Brooks once noted that “the most radical possible solution for constructing software is not to construct it at all” [16], but instead to assemble already-constructed pieces. Code reuse and composition are becoming increasingly important: a recent study found that a set of programs consisted of 32% reused code (not including libraries) [88], whereas a similar 1987 study estimated the figure at only 5% [38]. In 2005, a NASA survey found that the most significant barrier to reuse is that software is too difficult to understand or is poorly documented [42], ranking above even requirements and compatibility concerns. In a future where the focus of software engineering shifts from implementation to design and composition, program understandability will become even more important.
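To make the surface-feature insight concrete, the following minimal sketch (in Python) extracts a handful of lightweight lexical features from a code snippet and combines them linearly into a single readability score. The specific features, weights, and function names are illustrative assumptions chosen for exposition; they are not the actual feature set or trained model from our prior readability work [96].

import re

def surface_features(snippet):
    """Compute a few lightweight surface features of a code snippet."""
    lines = [ln for ln in snippet.splitlines() if ln.strip()]
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", snippet)
    return {
        "avg_line_length": sum(len(ln) for ln in lines) / max(len(lines), 1),
        "avg_identifier_length": sum(map(len, identifiers)) / max(len(identifiers), 1),
        # '=' is a rough proxy for assignments; it also counts '==' and '+='.
        "assignments_per_line": snippet.count("=") / max(len(lines), 1),
        "comment_density": sum(1 for ln in lines
                               if ln.lstrip().startswith(("#", "//"))) / max(len(lines), 1),
    }

def readability_score(snippet):
    """Combine features linearly. In practice the weights would be learned
    from human readability judgments rather than fixed by hand."""
    weights = {  # hypothetical weights, for illustration only
        "avg_line_length": -0.02,
        "avg_identifier_length": -0.05,
        "assignments_per_line": -0.30,
        "comment_density": 0.50,
    }
    features = surface_features(snippet)
    return 1.0 + sum(weights[name] * features[name] for name in weights)

if __name__ == "__main__":
    snippet = "total = 0\nfor item in items:\n    total += item.price  # accumulate\n"
    print(readability_score(snippet))

Because every feature here is computable from the token stream alone, scoring a file takes time linear in its length; this is the property that lets surface-feature models scale to large programs and generalize across application domains.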

[1] Shihong Huang et al. Towards a documentation maturity model. SIGDOC '03, 2003.

[2] Andy Chou et al. Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. SOSP 2001.

[3] Scott W. Ambler. Java Coding Standards. 1997.

[4] G. Harry McLaughlin. SMOG Grading: A New Readability Formula. 1969.

[5] George C. Necula et al. The Design and Implementation of a Certifying Compiler (with retrospective). PLDI 1998.

[6] David Hovemeyer et al. Finding Bugs is Easy. SIGPLAN Notices, 2004.

[7] Sriram K. Rajamani et al. Automatically validating temporal safety properties of interfaces. SPIN '01, 2001.

[8] Gregory Tassey et al. Prepared for what. 2007.

[9] J. Peter Kincaid et al. Derivation and Validation of the Automated Readability Index for Use with Technical Materials. 1970.

[10] Y. Inoue et al. A new tool to assess sarcoidosis severity. Chest, 2006.

[11] Tom DeMarco. Why Does Software Cost So Much? 1995.

[12] Martin P. Robillard et al. What Makes APIs Hard to Learn? Answers from Developers. IEEE Software, 2009.

[13] K. K. Aggarwal et al. An integrated measure of software maintainability. Annual Reliability and Maintainability Symposium, 2002.

[14] Darrell R. Raymond et al. Reading source code. CASCON, 1991.

[15] Thomas W. Reps et al. Precise interprocedural dataflow analysis via graph reachability. POPL '95, 1995.

[16] Angelos D. Keromytis et al. Countering network worms through automatic patch generation. IEEE Security & Privacy, 2005.

[17] Vivek Sarkar et al. A comparative study of static and profile-based heuristics for inlining. Dynamo, 2000.

[18] Westley Weimer et al. Patches as better bug reports. GPCE '06, 2006.

[19] Tomasz Imielinski et al. Mining association rules between sets of items in large databases. SIGMOD, 1993.

[20] Larry Carter et al. Path Analysis and Renaming for Predicated Instruction Scheduling. International Journal of Parallel Programming, 2004.

[21] William B. Frakes et al. Software reuse through information retrieval. ACM SIGIR Forum, 1986.

[22] Westley Weimer et al. The road not taken: Estimating path execution frequency statically. ICSE 2009.

[23] José Nelson Amaral et al. Aestimo: a feedback-directed optimization evaluation tool. IEEE International Symposium on Performance Analysis of Systems and Software, 2006.

[24] Jonathan Aldrich et al. Practical Exception Specifications. Advanced Topics in Exception Handling Techniques, 2006.

[25] Phillip A. Relf et al. Tool assisted identifier naming for improved software readability: an empirical study. International Symposium on Empirical Software Engineering, 2005.

[26] Nicholas Jalbert et al. Automated duplicate detection for bug tracking systems. DSN 2008.

[27] Koushik Sen et al. Concolic testing. ASE 2007.

[28] Guido van Rossum et al. Internet Programming with Python. 1996.

[29] Audris Mockus et al. Identifying reasons for software changes using historic databases. International Conference on Software Maintenance, 2000.

[30] Thomas Ball et al. Edge profiling versus path profiling: the showdown. POPL '98, 1998.

[31] R. Flesch. A new readability yardstick. Journal of Applied Psychology, 1948.

[32] James R. Larus et al. Efficient path profiling. MICRO 29, 1996.

[33] Westley Weimer et al. Privately Finding Specifications. IEEE Transactions on Software Engineering, 2008.

[34] Gustavo Alonso et al. Enhancing the fault tolerance of workflow management systems. IEEE Concurrency, 2000.

[35] Mary Beth Rosson. Human factors in programming and software development. ACM Computing Surveys, 1996.

[36] Michael D. Ernst et al. Automatically patching errors in deployed software. SOSP '09, 2009.

[37] Rupak Majumdar et al. Path slicing. PLDI '05, 2005.

[38] Mary Jean Harrold et al. Active learning for automatic classification of software behavior. 2004.

[39] Westley Weimer et al. Modeling bug report quality. ASE '07, 2007.

[40] Dawson R. Engler et al. Z-Ranking: Using Statistical Analysis to Counter the Impact of Static Analysis Approximations. SAS 2003.

[41] Koushik Sen et al. CUTE and jCUTE: Concolic Unit Testing and Explicit Path Model-Checking Tools. CAV 2006.

[42] Edsger W. Dijkstra. A Discipline of Programming. 1976.

[43] Hwee Tou Ng et al. A Machine Learning Approach to Coreference Resolution of Noun Phrases. Computational Linguistics, 2001.

[44] Rajiv Gupta et al. Profile-Guided Compiler Optimizations. The Compiler Design Handbook, 2002.

[45] Paul Anderson et al. Software Inspection Using CodeSurfer. 2001.

[46] Forrest Shull et al. Investigating Reading Techniques for Object-Oriented Framework Learning. IEEE Transactions on Software Engineering, 2000.

[47] Michael Burrows et al. Eraser: a dynamic data race detector for multithreaded programs. ACM Transactions on Computer Systems, 1997.

[48] Richard W. Selby. Enabling reuse-based software development of large-scale systems. IEEE Transactions on Software Engineering, 2005.

[49] Will Venters et al. Software Engineering: Theory and Practice. 2006.

[50] Lionel E. Deimel. The uses of program reading. ACM SIGCSE Bulletin, 1985.

[51] Gail C. Murphy et al. Coping with an open bug repository. eclipse '05, 2005.

[52] E. P. Schan et al. Recommended C Style and Coding Standards. 1997.

[53] Dirk Grunwald et al. Evidence-based static branch prediction using machine learning. ACM TOPLAS, 1997.

[54] Anand R. Tripathi et al. Issues with Exception Handling in Object-Oriented Systems. ECOOP 1997.

[55] Tom Cargill. Exception handling: a false sense of security. 1996.

[56] Claire Le Goues et al. Specification Mining with Few False Positives. TACAS 2009.

[57] Barton P. Miller et al. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 1990.

[58] Wendy G. Lehnert et al. Using Decision Trees for Coreference Resolution. IJCAI 1995.

[59] David G. Novick et al. What users say they want in documentation. SIGDOC '06, 2006.

[60] Greg Nelson et al. Extended static checking for Java. PLDI '02, 2002.

[61] Manuvir Das et al. Unification-based pointer analysis with directional assignments. PLDI '00, 2000.

[62] Spencer Rugaber. The use of domain knowledge in program understanding. Annals of Software Engineering, 2000.

[63] David A. Wagner et al. Model Checking One Million Lines of C Code. NDSS 2004.

[64] Pavol Cerný et al. Synthesis of interface specifications for Java classes. POPL '05, 2005.

[65] Mira Kajko-Mattsson. The state of documentation practice within corrective maintenance. ICSM 2001.

[66] Robert Dunn et al. Software Defect Removal. 1984.

[67] Martin P. Robillard et al. Regaining Control of Exception Handling. 1999.

[68] Forrest Shull et al. Improving software inspections by using reading techniques. ICSE 2000.

[69] George C. Necula et al. Finding and preventing run-time error handling mistakes. OOPSLA 2004.

[70] Monica S. Lam et al. Automatic extraction of object-oriented component interfaces. ISSTA '02, 2002.

[71] Thomas M. Pigoski. Practical Software Maintenance: Best Practices for Managing Your Software Investment. 1996.

[72] Timothy Lethbridge et al. The relevance of software documentation, tools and technologies: a survey. DocEng '02, 2002.

[73] Donald E. Knuth. Literate Programming. The Computer Journal, 1984.

[74] Robert L. Glass. Facts and Fallacies of Software Engineering. 2002.

[75] Clay Spinuzzi. Building More Usable APIs. IEEE Software, 1998.

[76] Claire Cardie et al. Noun Phrase Coreference as Clustering. EMNLP 1999.

[77] Michael I. Jordan et al. Bug isolation via remote program sampling. PLDI 2003.

[78] Antony I. T. Rowstron et al. Vigilante: End-to-end containment of Internet worm epidemics. ACM Transactions on Computer Systems, 2006.

[79] David Lorge Parnas. Software aging. ICSE 1994.

[80] Scott R. Tilley. Documentation for software engineers: what is needed to aid system understanding? SIGDOC '01, 2001.

[81] M. Kendall. The Problem of m Rankings. 1939.

[82] Benjamin W. Wah et al. Synthetic workload generation for load-balancing experiments. IEEE Parallel & Distributed Technology: Systems & Applications, 1995.

[83] Steven E. Stemler. A Comparison of Consensus, Consistency, and Measurement Approaches to Estimating Interrater Reliability. Practical Assessment, Research, and Evaluation, 2004.

[84] James M. Rehg et al. Active learning for automatic classification of software behavior. ISSTA '04, 2004.

[85] Claire Le Goues et al. Automatically finding patches using genetic programming. ICSE 2009.

[86] Lionel M. Ni et al. Benchmark workload generation and performance characterization of multiprocessors. Supercomputing '92, 1992.

[87] Sorin Lerner et al. ESP: path-sensitive program verification in polynomial time. PLDI '02, 2002.

[88] James R. Larus et al. Branch prediction for free. PLDI '93, 1993.

[89] Brad A. Myers et al. Jadeite: improving API documentation using usage information. CHI Extended Abstracts, 2009.

[90] Ian Sommerville. Software Engineering (7th Edition). 2004.

[91] Janet Nykaza et al. What programmers really want: results of a needs assessment for SDK documentation. SIGDOC '02, 2002.

[92] Nicolas Anquetil et al. A study of the documentation essential to software maintenance. SIGDOC '05, 2005.

[93] Margo I. Seltzer et al. Dealing with disaster: surviving misbehaved kernel extensions. OSDI '96, 1996.

[94] Giuliano Antoniol et al. Towards the Integration of Versioning Systems, Bug Reports and Source Code Meta-Models. SETra@ICGT, 2005.

[95] Arie van Deursen et al. Discovering faults in idiom-based exception handling. ICSE '06, 2006.

[96] Raymond P. L. Buse et al. A metric for software readability. ISSTA '08, 2008.

[97] R. Likert. A Technique for the Measurement of Attitudes. Archives of Psychology, 1932.

[98] David W. Wall. Predicting program behavior using real or estimated profiles. PLDI '91, 1991.

[99] Frederick P. Brooks. No Silver Bullet: Essence and Accidents of Software Engineering. 1987.

[100] Grace A. Lewis et al. Modernizing Legacy Systems: Software Technologies, Engineering Processes, and Business Practices. SEI Series in Software Engineering, 2003.

[101] Westley Weimer et al. Automatic documentation inference for exceptions. ISSTA '08, 2008.

[102] John C. Knight et al. Phased inspections and their implementation. ACM SIGSOFT Software Engineering Notes, 1991.

[103] George C. Necula et al. Mining Temporal Specifications for Error Detection. TACAS 2005.

[104] François Bodin et al. A Machine Learning Approach to Automatic Production of Compiler Heuristics. AIMSA 2002.

[105] Jerome A. Rolia et al. A Synthetic Workload Generation Technique for Stress Testing Session-Based Systems. IEEE Transactions on Software Engineering, 2006.

[106] Alex Groce et al. What Went Wrong: Explaining Counterexamples. SPIN 2003.

[107] Martin P. Robillard et al. Static analysis to support the evolution of exception structure in object-oriented systems. ACM TOSEM, 2003.

[108] Herb Sutter et al. C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. C++ In-Depth Series, 2004.

[109] Gregor Snelting et al. Efficient path conditions in dependence graphs. ICSE '02, 2002.

[110] R. Gunning. The Technique of Clear Writing. 1968.

[111] Chin-Yew Lin. Looking for a Few Good Metrics: ROUGE and its Evaluation. 2004.

[112] David Hovemeyer et al. Evaluating and tuning a static analysis to find null pointer bugs. PASTE '05, 2005.

[113] John B. Goodenough. Exception handling: issues and a proposed notation. Communications of the ACM, 1975.