Understanding, Debugging, and Optimizing Distributed Software Builds: A Design Study

Today's build systems distribute build tasks across thousands of machines and reuse cached build results whenever possible. Yet however sophisticated the build tool, the architecture of the software under build sets the lower bound on how fast the system can compile: long chains of consecutive build steps or slow individual build targets introduce expensive compilation bottlenecks. Moreover, the growing complexity of both build systems and the software they build makes comprehending, debugging, and optimizing build performance a significant challenge for many software engineers. We present a design study that describes and helps mitigate the cognitive challenges faced by software engineers who use modern, cached, and distributed build systems. We characterize the performance analysis process, identify the main stakeholders and key usage scenarios, and elicit important requirements for tool support. We propose BuildExplorer, an interactive tool for understanding, optimizing, and debugging cached and distributed build sessions, and justify our design decisions against alternative solutions. We evaluate our novel solution through usage scenario walkthroughs, iterative deployments of the tool in the field, and a user study.
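The lower bound mentioned above can be illustrated with a small sketch. It is not taken from BuildExplorer; the target names, durations, and the `critical_path` helper are hypothetical, and it assumes an acyclic dependency graph, no cache hits, and unlimited parallel machines. Under those assumptions, the build can finish no faster than its longest chain of dependent targets.

```python
from functools import lru_cache

# Hypothetical build graph: target -> (duration in seconds, dependencies).
BUILD_GRAPH = {
    "core": (40, []),
    "net": (25, ["core"]),
    "ui": (30, ["core"]),
    "app": (20, ["net", "ui"]),
    "docs": (15, []),
}


def critical_path(graph):
    """Return (total_seconds, targets) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def finish(target):
        duration, deps = graph[target]
        # A target can start only after its slowest dependency chain finishes.
        best = max((finish(dep) for dep in deps), default=(0, ()))
        return (best[0] + duration, best[1] + (target,))

    return max(finish(target) for target in graph)


total, chain = critical_path(BUILD_GRAPH)
print(total, list(chain))  # 90 ['core', 'ui', 'app']
```

The five targets take 130 seconds when built one after another, but even with unlimited machines the build cannot finish in under 90 seconds, because `app` waits on `ui`, which waits on `core`. Shortening that chain, or speeding up a target on it, is the only way to lower the bound.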
