Are Machine Learning Cloud APIs Used Correctly?

Machine learning (ML) cloud APIs enable developers to easily incorporate learning solutions into software systems. Unfortunately, ML APIs are challenging to use correctly and efficiently, given their unique semantics, data requirements, and accuracy-performance tradeoffs. Much prior work has studied how to develop ML APIs or ML cloud services, but not how open-source applications use ML APIs. In this paper, we manually study 360 representative open-source applications that use Google or AWS cloud-based ML APIs, and find that 70% of these applications contain API misuses in their latest versions that degrade the functional, performance, or economic quality of the software. We generalize 8 anti-patterns from our manual study and develop automated checkers that identify hundreds more applications containing ML API misuses.
