An Experimental Study on Applying Metamorphic Testing in Machine Learning Applications

Machine learning techniques have been successfully employed in various areas and, in particular, in the development of healthcare applications that support faster and more effective diagnosis (for example, cancer diagnosis). However, machine learning models may present uncertainties and errors. Errors in training, classification, and evaluation can produce incorrect results and, consequently, lead to wrong clinical decisions, reducing professionals' confidence in such techniques. As in other application domains, quality must be assured so that models are reliable enough to assist health professionals in their daily activities. Metamorphic testing is a promising option for validating machine learning applications. With this approach, it is possible to define relations that specify changes to be made to the application's input data in order to reveal faults. This paper presents an experimental study that evaluates the effectiveness of metamorphic testing for validating machine learning applications. A machine learning application for breast cancer diagnosis was developed using an available dataset of 569 samples whose data were taken from breast cancer images, and it served as the software under test to which metamorphic testing was applied. The results indicate that metamorphic testing can be an alternative to support the validation of machine learning applications.
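
To make the idea of a metamorphic relation concrete, the following minimal sketch checks one commonly used relation: reordering the feature columns identically in the training and test data should leave a distance-based classifier's predictions unchanged. It is an illustration only and does not reproduce the study's application, dataset, or relations. The toy k-nearest-neighbours classifier, the seeded off-by-one fault, the tiny hand-made dataset, and all function names are assumptions introduced for this example.

```python
from collections import Counter


def knn_predict(train_X, train_y, query, k=1, n_features=None):
    """Toy k-NN classifier using squared Euclidean distance (hypothetical)."""
    n = len(query) if n_features is None else n_features
    # Rank by (distance, index) so that ties are broken deterministically.
    ranked = sorted(
        (sum((row[j] - query[j]) ** 2 for j in range(n)), i)
        for i, row in enumerate(train_X)
    )
    votes = Counter(train_y[i] for _, i in ranked[:k])
    # Most votes wins; a tie in votes goes to the smaller label.
    return min(votes, key=lambda label: (-votes[label], label))


def buggy_predict(train_X, train_y, query, k=1):
    """Seeded fault for illustration: the last feature is silently ignored."""
    return knn_predict(train_X, train_y, query, k, n_features=len(query) - 1)


def permute_features(rows, order):
    """Reorder the columns of every row according to `order`."""
    return [[row[j] for j in order] for row in rows]


def mr_feature_permutation_holds(predict, train_X, train_y, test_X, order):
    """Return True if predictions are unchanged after permuting the features."""
    # Source test case: predictions on the original data.
    source = [predict(train_X, train_y, q) for q in test_X]
    # Follow-up test case: the same permutation applied to training and test data.
    follow_train = permute_features(train_X, order)
    follow_test = permute_features(test_X, order)
    follow_up = [predict(follow_train, train_y, q) for q in follow_test]
    return source == follow_up


# Integer-valued toy data keeps the distance arithmetic exact.
train_X = [[0, 0, 0, 0], [0, 0, 0, 10]]
train_y = [0, 1]
test_X = [[0, 0, 0, 9], [0, 0, 0, 1]]
order = [3, 0, 1, 2]

correct_ok = mr_feature_permutation_holds(knn_predict, train_X, train_y, test_X, order)
buggy_ok = mr_feature_permutation_holds(buggy_predict, train_X, train_y, test_X, order)
print("relation holds for the correct classifier:", correct_ok)
print("relation holds for the faulty classifier:", buggy_ok)
```

No expected class label is needed for any individual sample: the check compares only the outputs of the source and follow-up executions, which is why this style of testing suits programs that lack a test oracle. A violated relation signals a fault, whereas a relation that holds does not prove the program correct.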
