From Start-ups to Scale-ups: Opportunities and Open Problems for Static and Dynamic Program Analysis

This paper describes some of the challenges and opportunities of deploying static and dynamic analysis at scale, drawing on the authors' experience with the Infer and Sapienz technologies at Facebook, each of which started life as a research-led start-up that was subsequently deployed at scale, impacting billions of people worldwide. The paper identifies open problems that have yet to receive significant attention from the scientific community, yet which have potential for profound real-world impact, formulating these as research questions that, we believe, are ripe for exploration and that would make excellent topics for research projects. Note: This paper accompanies the authors' joint keynote at the 18th IEEE International Working Conference on Source Code Analysis and Manipulation, September 23rd-24th, 2018, Madrid, Spain.
