Deep Learning & Software Engineering: State of Research and Future Directions

Given the current transformative potential of research that sits at the intersection of Deep Learning (DL) and Software Engineering (SE), an NSF-sponsored community workshop was conducted in co-location with the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE'19) in San Diego, California. The goal of this workshop was to outline high priority areas for cross-cutting research. While a multitude of exciting directions for future work were identified, this report provides a general summary of the research areas representing the areas of highest priority which were discussed at the workshop. The intent of this report is to serve as a potential roadmap to guide future work that sits at the intersection of SE & DL.

[1]  Dawson R. Engler,et al.  KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs , 2008, OSDI.

[2]  Xiaowei Huang,et al.  Reachability Analysis of Deep Neural Networks with Provable Guarantees , 2018, IJCAI.

[3]  Flemming Nielson,et al.  Principles of Program Analysis , 1999, Springer Berlin Heidelberg.

[4]  Tim Menzies,et al.  Easy over hard: a case study on deep learning , 2017, ESEC/SIGSOFT FSE.

[5]  Song Wang,et al.  Automatically Learning Semantic Features for Defect Prediction , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[6]  Lei Ma,et al.  DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[7]  Matthew B. Dwyer,et al.  Refactoring Neural Networks for Verification , 2019, ArXiv.

[8]  Nicholas A. Kraft,et al.  Exploring the use of deep learning for feature location , 2015, 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[9]  Mykel J. Kochenderfer,et al.  Algorithms for Verifying Deep Neural Networks , 2019, Found. Trends Optim..

[10]  Sanjit A. Seshia,et al.  Towards Verified Artificial Intelligence , 2016, ArXiv.

[11]  Aditi Raghunathan,et al.  Certified Defenses against Adversarial Examples , 2018, ICLR.

[12]  Huan Liu,et al.  Understanding Neural Networks via Rule Extraction , 1995, IJCAI.

[13]  Judith Carlton,et al.  Verification of C Programs Using Automated Reasoning , 2007, Fifth IEEE International Conference on Software Engineering and Formal Methods (SEFM 2007).

[14]  Swarat Chaudhuri,et al.  AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation , 2018, 2018 IEEE Symposium on Security and Privacy (SP).

[15]  Zohar Manna,et al.  The Temporal Logic of Reactive and Concurrent Systems , 1991, Springer New York.

[16]  Andrian Marcus,et al.  On the Vocabulary Agreement in Software Issue Descriptions , 2016, 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[17]  Harald C. Gall,et al.  Software Engineering for Machine Learning: A Case Study , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).

[18]  Xin Zhang,et al.  End to End Learning for Self-Driving Cars , 2016, ArXiv.

[19]  David S. Rosenblum A Practical Approach to Programming With Assertions , 1995, IEEE Trans. Software Eng..

[20]  Mykel J. Kochenderfer,et al.  Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks , 2017, CAV.

[21]  J. Zico Kolter,et al.  Provable defenses against adversarial examples via the convex outer adversarial polytope , 2017, ICML.

[22]  James C. King,et al.  Symbolic execution and program testing , 1976, CACM.

[23]  Xiangyu Zhang,et al.  Automatic Text Input Generation for Mobile Testing , 2017, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[24]  Philip S. Yu,et al.  Improving Automatic Source Code Summarization via Deep Reinforcement Learning , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[25]  Russ Tedrake,et al.  Evaluating Robustness of Neural Networks with Mixed Integer Programming , 2017, ICLR.

[26]  Timon Gehr,et al.  An abstract domain for certifying neural networks , 2019, Proc. ACM Program. Lang..

[27]  Gabriele Bavota,et al.  An Empirical Investigation into Learning Bug-Fixing Patches in the Wild via Neural Machine Translation , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[28]  Albert Oliveras,et al.  6 Years of SMT-COMP , 2012, Journal of Automated Reasoning.

[29]  Hui Liu,et al.  Deep Learning Based Feature Envy Detection , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[30]  Mykel J. Kochenderfer,et al.  The Marabou Framework for Verification and Analysis of Deep Neural Networks , 2019, CAV.

[31]  Abhinav Verma,et al.  Programmatically Interpretable Reinforcement Learning , 2018, ICML.

[32]  Ashish Tiwari,et al.  Output Range Analysis for Deep Feedforward Neural Networks , 2018, NFM.

[33]  Collin McMillan,et al.  A Neural Model for Generating Natural Language Summaries of Program Subroutines , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).

[34]  Lei Ma,et al.  DeepHunter: a coverage-guided fuzz testing framework for deep neural networks , 2019, ISSTA.

[35]  Albert L. Baker,et al.  JML: A Notation for Detailed Design , 1999, Behavioral Specifications of Businesses and Systems.

[36]  Min Wu,et al.  Safety Verification of Deep Neural Networks , 2016, CAV.

[37]  Miltiadis Allamanis,et al.  The adverse effects of code duplication in machine learning models of code , 2018, Onward!.

[38]  Geoff Sutcliffe,et al.  The TPTP Problem Library , 1994, Journal of Automated Reasoning.

[39]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[40]  Xiaodong Gu,et al.  Deep Code Search , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[41]  Divya Gopinath,et al.  Property Inference for Deep Neural Networks , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[42]  Clark W. Barrett,et al.  The SMT-LIB Standard Version 2.0 , 2010 .

[43]  Suman Jana,et al.  DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[44]  Jane Cleland-Huang,et al.  Semantically Enhanced Software Traceability Using Deep Learning Techniques , 2017, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[45]  Claire Le Goues,et al.  Automatically finding patches using genetic programming , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[46]  Sorin Lerner,et al.  ESP: path-sensitive program verification in polynomial time , 2002, PLDI '02.

[47]  Rüdiger Ehlers,et al.  Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks , 2017, ATVA.

[48]  Denys Poshyvanyk,et al.  Machine Learning-Based Prototyping of Graphical User Interfaces for Mobile Apps , 2018, IEEE Transactions on Software Engineering.

[49]  Roderick Chapman,et al.  Are We There Yet? 20 Years of Industrial Theorem Proving with SPARK , 2014, ITP.

[50]  Daniel Kroening,et al.  Concolic Testing for Deep Neural Networks , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[51]  Ian Goodfellow,et al.  TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing , 2018, ICML.

[52]  Junfeng Yang,et al.  Efficient Formal Safety Analysis of Neural Networks , 2018, NeurIPS.

[53]  Sriram K. Rajamani,et al.  SLAM and Static Driver Verifier: Technology Transfer of Formal Methods inside Microsoft , 2004, IFM.

[54]  M A Branch,et al.  Software maintenance management , 1986 .

[55]  Weiming Xiang,et al.  Output Reachable Set Estimation and Verification for Multilayer Neural Networks , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[56]  Gail E. Kaiser,et al.  Properties of Machine Learning Applications for Use in Metamorphic Testing , 2008, SEKE.

[57]  Inderjit S. Dhillon,et al.  Towards Fast Computation of Certified Robustness for ReLU Networks , 2018, ICML.

[58]  Junfeng Yang,et al.  Formal Security Analysis of Neural Networks using Symbolic Intervals , 2018, USENIX Security Symposium.

[59]  Antonio Criminisi,et al.  Measuring Neural Net Robustness with Constraints , 2016, NIPS.

[60]  B. J. Taylor,et al.  Rule extraction as a formal method for the verification and validation of neural networks , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[61]  Lori A. Clarke,et al.  A System to Generate Test Data and Symbolically Execute Programs , 1976, IEEE Transactions on Software Engineering.

[62]  Matthew Mirman,et al.  Fast and Effective Robustness Certification , 2018, NeurIPS.

[63]  Patricia Lago,et al.  Architectural Technical Debt Identification: The Research Landscape , 2018, 2018 IEEE/ACM International Conference on Technical Debt (TechDebt).

[64]  Yibo Xue,et al.  Fine-grained Android Malware Detection based on Deep Learning , 2018, 2018 IEEE Conference on Communications and Network Security (CNS).

[65]  Edmund M. Clarke,et al.  Counterexample-guided abstraction refinement , 2003, 10th International Symposium on Temporal Representation and Reasoning, 2003 and Fourth International Conference on Temporal Logic. Proceedings..

[66]  Mario Linares Vásquez,et al.  Auto-completing bug reports for Android applications , 2015, ESEC/SIGSOFT FSE.

[67]  Andrian Marcus,et al.  Using Observed Behavior to Reformulate Queries during Text Retrieval-based Bug Localization , 2017, 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[68]  Denys Poshyvanyk,et al.  SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair , 2018, IEEE Transactions on Software Engineering.

[69]  Nikolaj Bjørner,et al.  Z3: An Efficient SMT Solver , 2008, TACAS.

[70]  Carlos R. del-Blanco,et al.  DroNet: Learning to Fly by Driving , 2018, IEEE Robotics and Automation Letters.

[71]  Sarfraz Khurshid,et al.  Generalized Symbolic Execution for Model Checking and Testing , 2003, TACAS.

[72]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[73]  Mario Linares Vásquez,et al.  FUSION: A Tool for Facilitating and Augmenting Android Bug Reporting , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).

[74]  Romain Robbes,et al.  Characterizing and Understanding Development Sessions , 2007, 15th IEEE International Conference on Program Comprehension (ICPC '07).

[75]  Martin White,et al.  Deep learning code fragments for code clone detection , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[76]  Lei Ma,et al.  Coverage-Guided Fuzzing for Feedforward Neural Networks , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[77]  Koushik Sen,et al.  CUTE: a concolic unit testing engine for C , 2005, ESEC/FSE-13.

[78]  Ananthram Swami,et al.  Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples , 2016, ArXiv.

[79]  Claire Le Goues,et al.  A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[80]  Gerard J. Holzmann,et al.  The power of 10: rules for developing safety-critical code , 2006, Computer.

[81]  Stephan Merz,et al.  Model Checking , 2000 .

[82]  Somesh Jha,et al.  Semantic Adversarial Deep Learning , 2018, IEEE Design & Test.

[83]  W. B. Roberts,et al.  Machine Learning: The High Interest Credit Card of Technical Debt , 2014 .

[84]  Pushmeet Kohli,et al.  A Dual Approach to Scalable Verification of Deep Networks , 2018, UAI.

[85]  Armando Solar-Lezama,et al.  Verifiable Reinforcement Learning via Policy Extraction , 2018, NeurIPS.

[86]  Zhenchang Xing,et al.  Predicting semantically linkable knowledge in developer online forums via convolutional neural network , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[87]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[88]  Sijia Liu,et al.  CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks , 2018, AAAI.

[89]  Junfeng Yang,et al.  DeepXplore: Automated Whitebox Testing of Deep Learning Systems , 2017, SOSP.

[90]  Huai Liu,et al.  How Effectively Does Metamorphic Testing Alleviate the Oracle Problem? , 2014, IEEE Transactions on Software Engineering.

[91]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[92]  Bertrand Meyer,et al.  Applying 'design by contract' , 1992, Computer.

[93]  Pushmeet Kohli,et al.  A Unified View of Piecewise Linear Neural Network Verification , 2017, NeurIPS.

[94]  Sarfraz Khurshid,et al.  Symbolic Execution for Attribution and Attack Synthesis in Neural Networks , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).

[95]  Koushik Sen,et al.  Symbolic execution for software testing: three decades later , 2013, CACM.

[96]  Sarfraz Khurshid,et al.  Symbolic execution for software testing in practice: preliminary assessment , 2011, 2011 33rd International Conference on Software Engineering (ICSE).