An interpretable machine learning framework for modelling human decision behavior

Machine learning has recently been widely adopted to address managerial decision-making problems. However, there is a trade-off between performance and interpretability. Full-complexity models (such as neural network-based models) are non-traceable black boxes, whereas classic interpretable models (such as logistic regression) are usually simplified and less accurate. This trade-off limits the application of state-of-the-art machine learning models to management problems, which require both high prediction performance and an understanding of each attribute's contribution to the model outcome. Multiple criteria decision aiding (MCDA) is a family of interpretable approaches to depicting the rationale of human decision behavior, but it is limited by strong assumptions (e.g., preference independence). In this paper, we propose an interpretable machine learning approach, namely Neural Network-based Multiple Criteria Decision Aiding (NN-MCDA), which combines an additive MCDA model and a fully connected multilayer perceptron (MLP) to achieve good performance while preserving a certain degree of interpretability. NN-MCDA has a linear component (an additive form of a set of polynomial functions) that captures the detailed relationship between individual attributes and the prediction, and a nonlinear component (a standard MLP) that captures the high-order interactions between attributes and their complex nonlinear transformations. We demonstrate the effectiveness of NN-MCDA with extensive simulation studies and two real-world datasets. To the best of our knowledge, this research is the first to enhance the interpretability of machine learning models with MCDA techniques. The proposed framework also sheds light on how machine learning techniques can be used to free MCDA from strong assumptions.
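The two-component structure described above can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation. The function names, the mixing weight `alpha`, the single ReLU hidden layer, and the sigmoid output are all assumptions made for illustration; the abstract states only that an additive polynomial component is combined with a standard MLP.

```python
import math

def marginal_values(x, poly_weights):
    """Additive component: one polynomial per attribute.

    poly_weights[j][k] is the (hypothetical) coefficient of x_j ** (k + 1).
    Returns the per-attribute contributions, which are the interpretable part.
    """
    return [sum(w * xj ** (k + 1) for k, w in enumerate(wj))
            for xj, wj in zip(x, poly_weights)]

def mlp_score(x, W1, b1, w2, b2):
    """Nonlinear component: one ReLU hidden layer mapped to a scalar score."""
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def predict(x, poly_weights, mlp_params, alpha=0.5):
    """Blend both components (alpha is an assumed mixing weight) and squash.

    Returns the predicted probability and the per-attribute contributions.
    """
    contributions = marginal_values(x, poly_weights)
    score = alpha * sum(contributions) + (1.0 - alpha) * mlp_score(x, *mlp_params)
    prob = 1.0 / (1.0 + math.exp(-score))
    return prob, contributions

# Toy example with hand-picked (not learned) parameters.
x = [0.5, 1.0]
poly_weights = [[1.0, 2.0], [0.0, -1.0]]       # degree-2 polynomial per attribute
mlp_params = ([[1.0, -1.0]], [0.0], [2.0], 0.0)  # W1, b1, w2, b2
prob, contributions = predict(x, poly_weights, mlp_params)
print(prob, contributions)
```

In this sketch the list returned by `marginal_values` is what a reader would inspect to see how each attribute moves the prediction, while `mlp_score` absorbs interactions that the additive part cannot represent.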
