Learning Classification with Auxiliary Probabilistic Information

Finding ways to incorporate auxiliary information or auxiliary data into the learning process has been an active topic of data mining and machine learning research in recent years. In this work we study and develop a new framework for the classification learning problem in which, in addition to class labels, the learner is provided with auxiliary (probabilistic) information that reflects how strongly the expert feels about the class label. This approach can be extremely useful for many practical classification tasks that rely on subjective label assessment and where the cost of acquiring the auxiliary information is negligible compared to the cost of analyzing and labelling the example. We develop classification algorithms capable of using the auxiliary information to make the learning process more efficient in terms of sample complexity. We demonstrate the benefit of the approach on a number of synthetic and real-world data sets by comparing it to learning from class labels alone.
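To make the idea concrete, here is a minimal sketch (not the authors' algorithm) of one natural way to exploit such auxiliary information: treat the expert's confidence as a soft target in the cross-entropy loss of a logistic regression model. All names (`fit_soft`, `X`, `q`) and the toy data are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch, assuming the auxiliary information takes the form of an
# expert-assigned class probability q in [0, 1] used as a soft target.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_soft(X, q, lr=0.1, epochs=500, l2=1e-3):
    """Fit logistic regression to probabilistic labels q by gradient descent.

    X : (n, d) feature matrix; q : (n,) expert-assigned P(y = 1 | x).
    Minimizes the soft cross-entropy -q*log(p) - (1-q)*log(1-p) plus an
    L2 penalty; with hard labels (q in {0, 1}) this reduces to ordinary
    logistic regression.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        g = p - q                          # gradient of the loss w.r.t. the logits
        w -= lr * (X.T @ g / n + l2 * w)   # average gradient + ridge term
        b -= lr * g.mean()
    return w, b

# Toy usage: two Gaussian classes; the hypothetical "expert" reports
# calibrated probabilities instead of hard 0/1 labels.
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-1, 1, (n // 2, 2)), rng.normal(1, 1, (n // 2, 2))])
q = sigmoid(X @ np.array([1.5, 1.5]))      # assumed expert confidence model
w, b = fit_soft(X, q)
print("learned weights:", w, "bias:", b)
```

The intuition behind the claimed sample-complexity gains is visible here: a soft target carries more information per example than a hard label, so each labelled point constrains the decision boundary more tightly.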
