Automated Large-Scale Multi-Language Dynamic Program Analysis in the Wild (Tool Insights Paper)

Today’s availability of open-source software is overwhelming, and the number of free, ready-to-use software components in package repositories such as NPM, Maven, or SBT is growing exponentially. In this paper we address two straightforward yet important research questions: would it be possible to develop a tool to automate dynamic program analysis on public open-source software at a large scale? Moreover, and perhaps more importantly, would such a tool be useful? We answer the first question by introducing NAB, a tool for executing large-scale dynamic program analysis of open-source software in the wild. NAB is fully automatic, language-agnostic, and can scale dynamic program analyses on open-source software up to thousands of projects hosted in code repositories. Using NAB, we analyzed more than 56K Node.js, Java, and Scala projects. The data collected by NAB allowed us to (1) study the adoption of new language constructs such as JavaScript Promises, (2) collect statistics about bad coding practices in JavaScript, and (3) identify Java and Scala task-parallel workloads suitable for inclusion in a domain-specific benchmark suite. We consider these findings and the collected data an affirmative answer to the second question.

2012 ACM Subject Classification: Software and its engineering → Dynamic analysis
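
To make the kind of analysis mentioned in the abstract concrete, the sketch below shows, in TypeScript for Node.js, how a dynamic analysis might count JavaScript Promise instantiations by proxying the global Promise constructor. This is a minimal illustration under our own assumptions, not NAB's implementation or API; all identifiers (promiseCount, countingPromise, workload) are hypothetical.

```typescript
// Hypothetical sketch (not NAB's API): count explicit Promise allocations
// by replacing the global Promise constructor with a counting proxy.
let promiseCount = 0;
const NativePromise = Promise;

const countingPromise = new Proxy(NativePromise, {
  construct(target, args) {
    promiseCount++; // one explicit Promise allocation observed
    return Reflect.construct(target, args);
  },
});

// Install the proxy so the analyzed code's `new Promise(...)` is intercepted.
(globalThis as any).Promise = countingPromise;

// Example workload standing in for an analyzed project (hypothetical).
async function workload(): Promise<void> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
}

workload().then(() => {
  (globalThis as any).Promise = NativePromise; // restore the original constructor
  console.log(`Explicit Promise allocations observed: ${promiseCount}`);
});
```

As the abstract describes, NAB's contribution is automating and scaling per-project analyses of this kind to thousands of repositories; how the results are aggregated across projects is detailed in the paper itself.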
