Challenges of Testing Machine Learning Applications

Machine learning applications have achieved impressive results in many areas and provide effective solutions to problems such as image recognition, autonomous driving, and voice processing. As these applications are adopted in multiple critical areas, their reliability and robustness become increasingly important. Software testing is a standard way to ensure application quality, so approaches for testing machine learning applications are needed. This paper analyzes the characteristics of several machine learning algorithms and summarizes the main challenges of testing machine learning applications. It then presents several preliminary techniques that address these challenges and demonstrates how they can be used to solve problems that arise when testing machine learning applications.
