Three-Way Decision for Handling Uncertainty in Machine Learning: A Narrative Review

In this work, we introduce a framework, based on three-way decision (TWD) and the trisecting-acting-outcome model, for handling uncertainty in Machine Learning (ML). We distinguish between handling uncertainty that affects the input of ML models, where TWD is used to identify uncertain instances and take them properly into account, and handling uncertainty in the output, where TWD is used to allow the ML model to abstain. We then present a narrative review of the state of the art in applications of TWD across the areas of concern identified by the framework, highlighting both the strengths of the three-way methodology and the opportunities for further research.
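As an illustration of the output-side use of TWD described above, the following is a minimal sketch of a threshold-based three-way classifier that abstains on uncertain instances. It is not the framework proposed in this work: the function names, the labels "accept", "reject" and "abstain", and the default thresholds `alpha` and `beta` are assumptions chosen for the example, following the common probabilistic formulation in which an instance is accepted when its estimated positive-class probability is at least `alpha`, rejected when it is at most `beta`, and deferred otherwise.

```python
from typing import List


def three_way_decide(prob_positive: float,
                     alpha: float = 0.75,
                     beta: float = 0.25) -> str:
    """Assign one of three actions from a positive-class probability.

    alpha and beta are illustrative thresholds (assumed values);
    they must satisfy 0 <= beta < alpha <= 1.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if not 0.0 <= prob_positive <= 1.0:
        raise ValueError("prob_positive must lie in [0, 1]")
    if prob_positive >= alpha:
        return "accept"    # positive region: confident positive
    if prob_positive <= beta:
        return "reject"    # negative region: confident negative
    return "abstain"       # boundary region: defer the decision


def trisect(probs: List[float],
            alpha: float = 0.75,
            beta: float = 0.25) -> List[str]:
    """Apply the three-way rule to a list of probabilities."""
    return [three_way_decide(p, alpha, beta) for p in probs]


if __name__ == "__main__":
    # Hypothetical model outputs, for illustration only.
    print(trisect([0.9, 0.5, 0.1]))
```

In practice the probabilities would come from a trained model, and the thresholds would typically be derived from misclassification and abstention costs rather than fixed by hand.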
