Paper Information - Physiological Indicators for User Trust in Machine Learning with Influence Enhanced Fact-Checking

Trustworthy Machine Learning (ML) is one of the significant challenges of “black-box” ML because of its wide impact on practical applications. This paper investigates whether presenting the influence of training data points on ML predictions boosts user trust. A fact-checking framework is proposed for a predictive decision-making scenario, in which users interactively check training data points with different influences on the prediction through a parallel-coordinates visualization. This work also investigates the feasibility of physiological signals such as Galvanic Skin Response (GSR) and Blood Volume Pulse (BVP) as indicators of user trust in predictive decision making. A user study found that presenting the influences of training data points significantly increases user trust in predictions, but only for training data points with higher influence values under the high model performance condition, where users can justify their actions with facts that are more similar to the testing data point. The physiological signal analysis showed that GSR and BVP features correlate with user trust under different influence and model performance conditions. These findings suggest that physiological indicators can be integrated into the user interface of AI applications to automatically communicate variations in user trust during predictive decision making.
