A Survey of Software Quality for Machine Learning Applications

Machine learning (ML) is now widespread. Traditional software engineering can be applied to the development of ML applications. However, we have to consider problems specific to ML applications in terms of their quality. In this paper, we present a survey of software quality for ML applications, treating the quality of ML applications as an emerging topic of discussion. From this survey, we identified problems with ML applications and found software engineering approaches and software testing research areas that could address these problems. We classified survey targets into academic conferences, magazines, and communities. We targeted 16 academic conferences on artificial intelligence and software engineering, covering 78 papers, and 5 magazines, covering 22 papers. The results indicate key areas, such as deep learning, fault localization, and prediction, to be researched with software engineering and testing.
