Learning classification models from multiple experts

Building classification models from clinical data using machine learning methods often relies on the labeling of patient examples by human experts. The standard machine learning framework assumes that the labels are assigned by a homogeneous process. In reality, however, the labels may come from multiple experts, and it may be difficult to obtain a set of class labels that everybody agrees on; it is not uncommon for different experts to hold different subjective opinions on how a specific patient example should be classified. In this work we propose and study a new multi-expert learning framework that assumes the class labels are provided by multiple experts and that these experts may differ in their class label assessments. The framework explicitly models different sources of disagreement and lets us naturally combine labels from different human experts to obtain (1) a consensus classification model representing the model the group of experts converges to, and (2) individual expert models. We test the proposed framework by building a model for the detection of heparin-induced thrombocytopenia (HIT), where examples are labeled by three experts. We show that our framework is superior to multiple baselines (including the standard machine learning framework, in which expert differences are ignored) and that it leads to both improved consensus models and improved individual expert models.
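
To make the contrast with the standard framework concrete, the following is a minimal sketch of one simple way to ignore expert differences: collapsing the labels of several experts into a single label per example by majority vote. It is an illustration only, not the proposed framework and not necessarily one of the baselines evaluated in this work. The function names (`majority_vote`, `disagreement_rate`), the tie-breaking rule, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Hashable, List, Set, Tuple

# Hypothetical annotation record: (example_id, expert_id, label).
Annotation = Tuple[Hashable, Hashable, int]


def majority_vote(annotations: List[Annotation]) -> Dict[Hashable, int]:
    """Collapse per-expert labels into one label per example.

    Ties are broken toward the larger label; this is an arbitrary
    choice made for the sketch.
    """
    votes: Dict[Hashable, Counter] = {}
    for example_id, _expert_id, label in annotations:
        votes.setdefault(example_id, Counter())[label] += 1
    return {
        example_id: max(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
        for example_id, counts in votes.items()
    }


def disagreement_rate(annotations: List[Annotation]) -> float:
    """Fraction of examples on which the experts did not all agree."""
    labels: Dict[Hashable, Set[int]] = {}
    for example_id, _expert_id, label in annotations:
        labels.setdefault(example_id, set()).add(label)
    if not labels:
        return 0.0
    return sum(len(seen) > 1 for seen in labels.values()) / len(labels)


# Toy data (invented for illustration): three experts label three patients.
toy = [
    ("p1", "A", 1), ("p1", "B", 1), ("p1", "C", 0),
    ("p2", "A", 0), ("p2", "B", 0), ("p2", "C", 0),
    ("p3", "A", 1), ("p3", "B", 0), ("p3", "C", 0),
]
consensus = majority_vote(toy)
rate = disagreement_rate(toy)
```

Collapsing the labels this way discards which expert said what, so neither a per-expert model nor the sources of disagreement can be recovered afterwards; that lost information is what a multi-expert framework aims to use.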
