Dealing with Expert Bias in Collective Decision-Making

Many real-world problems can be formulated as decision-making problems in which one must repeatedly choose an appropriate alternative from a given set. Multiple expert judgements, whether human or artificial, can help in making correct decisions, especially when exploring alternative solutions is costly. As expert opinions may diverge, the problem of finding the right alternative can be approached as a collective decision-making (CDM) problem via the aggregation of independent judgements. Current state-of-the-art approaches focus on efficiently identifying the optimal expert, and thus perform poorly when no expert is sufficiently qualified or when experts are overly biased, potentially derailing the decision-making process. In this paper, we propose a new algorithmic approach based on contextual multi-armed bandits (CMAB) to identify and counteract such biased expertise. We study homogeneous, heterogeneous and polarised expert groups and show that our approach effectively exploits the collective expertise, outperforming state-of-the-art methods, especially when the quality of the provided expertise degrades. Our novel CMAB-inspired approach achieves higher final performance while converging more rapidly than previous adaptive algorithms.

∗Corresponding author. Email addresses: axel.abels@ulb.be (Axel Abels), tom.lenaerts@ulb.be (Tom Lenaerts), vito.trianni@istc.cnr.it (Vito Trianni), ann.nowe@vub.be (Ann Nowé)

arXiv:2106.13539v2 [cs.AI] 29 Aug 2022
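To make the CMAB-with-expert-advice setting concrete, the following is a minimal sketch of the classical EXP4-style weighting scheme on which such approaches build (this is the textbook algorithm, not the paper's own method): each expert supplies a probability distribution over arms given the current context, the learner mixes these distributions according to trust weights, and biased or unqualified experts lose weight over time. All function and variable names here are illustrative.

```python
import numpy as np

def exp4(expert_advice, rewards, gamma=0.1, rng=None):
    """Minimal EXP4 sketch.

    expert_advice: array (T, N, K) -- each of N experts' probability
        distribution over K arms at each of T rounds.
    rewards: array (T, K) -- reward of each arm per round; only the
        pulled arm's reward is used, as in the bandit setting.
    gamma: exploration / learning-rate parameter.
    Returns (total reward collected, final expert weights).
    """
    rng = np.random.default_rng(rng)
    T, N, K = expert_advice.shape
    w = np.ones(N)          # trust weight per expert
    total = 0.0
    for t in range(T):
        # Mix expert distributions according to normalised trust weights.
        p = (w / w.sum()) @ expert_advice[t]      # shape (K,)
        p = (1 - gamma) * p + gamma / K           # uniform exploration
        arm = rng.choice(K, p=p)
        r = rewards[t, arm]
        total += r
        # Importance-weighted estimate of the pulled arm's reward.
        xhat = r / p[arm]
        # Credit each expert by the probability it assigned to that arm:
        # experts that consistently back rewarding arms gain weight,
        # biased experts decay.
        w *= np.exp(gamma * expert_advice[t, :, arm] * xhat / K)
        w /= w.max()        # renormalise to avoid numerical overflow
    return total, w
```

For example, with one reliable expert that always recommends the rewarding arm and one biased expert that always recommends the other, the reliable expert's weight dominates after a few hundred rounds, illustrating how the weighting counteracts biased advice.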
