Testing Machine Learning Algorithms for Balanced Data Usage

With the increased application of machine learning (ML) algorithms to decision-making processes, the question of fairness of such algorithms came into the focus. Fairness testing aims at checking whether a classifier as "learned" by an ML algorithm on some training data is biased in the sense of discriminating against some of the attributes (e.g. gender or age). Fairness testing thus targets the prediction phase in ML, not the learning phase. In this paper, we investigate fairness for the learning phase. Our definition of fairness is based on the idea that the learner should treat all data in the training set equally, disregarding issues like names or orderings of features or orderings of data instances. We term this property balanced data usage. We consequently develop a (metamorphic) testing approach called TiLe for checking balanced data usage. TiLe is applied on 14 ML classifiers taken from the scikit-learn library using 4 artificial and 9 real-world data sets for training, finding 12 of the classifiers to be unbalanced.

[1]  Yuriy Brun,et al.  Fairness testing: testing software for discrimination , 2017, ESEC/SIGSOFT FSE.

[2]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[3]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[4]  Swarat Chaudhuri,et al.  AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation , 2018, 2018 IEEE Symposium on Security and Privacy (SP).

[5]  Huai Liu,et al.  Metamorphic Testing , 2018, ACM Comput. Surv..

[6]  Aws Albarghouthi,et al.  FairSquare: probabilistic verification of program fairness , 2017, Proc. ACM Program. Lang..

[7]  Daniel Kroening,et al.  Concolic Testing for Deep Neural Networks , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[8]  Gemma C. Garriga,et al.  Permutation Tests for Studying Classifier Performance , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[9]  Tsong Yueh Chen,et al.  Metamorphic Testing: A New Approach for Generating Next Test Cases , 2020, ArXiv.

[10]  Tsong Yueh Chen,et al.  Enhancing Supervised Classifications with Metamorphic Relations , 2018, 2018 IEEE/ACM 3rd International Workshop on Metamorphic Testing (MET).

[11]  Shin Nakajima,et al.  Dataset Coverage for Testing Machine Learning Computer Programs , 2016, 2016 23rd Asia-Pacific Software Engineering Conference (APSEC).

[12]  Peter Müller,et al.  An Abstract Interpretation Framework for Input Data Usage , 2018, ESOP.

[13]  Xin-Hua Hu,et al.  Validating a deep learning framework by metamorphic testing , 2017 .

[14]  Noah J. Goodall,et al.  Can you program ethics into a self-driving car? , 2016, IEEE Spectrum.

[15]  Suman Jana,et al.  DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[16]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[17]  Junfeng Yang,et al.  DeepXplore , 2019, Commun. ACM.

[18]  Gail E. Kaiser,et al.  Properties of Machine Learning Applications for Use in Metamorphic Testing , 2008, SEKE.

[19]  Ashutosh Kumar Singh,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2010 .

[20]  Min Wu,et al.  Safety Verification of Deep Neural Networks , 2016, CAV.

[21]  Sergio Segura,et al.  A Survey on Metamorphic Testing , 2016, IEEE Transactions on Software Engineering.

[22]  Yu-Fang Chen,et al.  Commutativity of Reducers , 2015, TACAS.

[23]  Toon Calders,et al.  Building Classifiers with Independency Constraints , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[24]  Julia Rubin,et al.  Fairness Definitions Explained , 2018, 2018 IEEE/ACM International Workshop on Software Fairness (FairWare).

[25]  Rüdiger Ehlers,et al.  Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks , 2017, ATVA.

[26]  Krishna P. Gummadi,et al.  Fairness Constraints: Mechanisms for Fair Classification , 2015, AISTATS.

[27]  Mykel J. Kochenderfer,et al.  Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks , 2017, CAV.

[28]  Baowen Xu,et al.  Testing and validating machine learning classifiers by metamorphic testing , 2011, J. Syst. Softw..