Paper Information - A Test Architecture for Machine Learning Product

A Test Architecture for Machine Learning Product

As machine learning (ML) technology evolves rapidly and spreads, systems and services that use it, called ML products, have a large impact on our lives, society and the economy. Meanwhile, Quality Assurance (QA) for ML products is considerably more difficult than for hardware, non-ML software and services, because ML technology achieves much better performance than non-ML technology at the cost of characteristics such as low explainability. We must sustain this rapid evolution while reducing the quality risk of ML products. In this paper, we present a Quality Assurance framework for ML products. The scope of QA in this paper is limited to product evaluation. First, we propose a QA policy for ML products. As one part of the policy, general principles of product evaluation are introduced and applied to ML product evaluation. These principles are summarized as A-ARAI: Allowability, Achievability, Robustness, Avoidability and Improvability. As another part of the policy, a strategy for ML product evaluation is constructed. A Quality Integrity Level for ML products is also modelled. Second, we propose a test architecture for ML product testing. It consists of test levels and fundamental test types, including snapshot testing, learning testing and confrontation testing. Finally, we define QA activity levels for ML products.
