Unsupervised Ensemble Regression

Consider a regression problem with no labeled data, where the only observations are the predictions $f_i(x_j)$ of $m$ experts $f_i$ over many samples $x_j$. With no knowledge of the experts' accuracies, is it still possible to accurately estimate the unknown responses $y_j$? Can one detect the least or most accurate experts? In this work we propose a framework to study these questions, based on the assumption that the $m$ experts have uncorrelated deviations from the optimal predictor. Assuming the first two moments of the response are known, we develop methods to detect the best and worst regressors and derive U-PCR, a novel principal components approach for unsupervised ensemble regression. We provide theoretical support for U-PCR and illustrate its improved accuracy over the ensemble mean and median on a variety of regression problems.
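The idea described above can be sketched in code. Under the assumed model, each expert's prediction is the true response plus an uncorrelated deviation, so the covariance matrix of the prediction matrix is approximately rank-one plus diagonal, and its leading principal component yields combination weights. The sketch below is an illustrative reconstruction under these assumptions; the function name `u_pcr`, the sign convention, and the moment-matching rescaling are hypothetical choices, not the paper's exact algorithm.

```python
import numpy as np

def u_pcr(F, mu_y, var_y):
    """Illustrative PCA-based unsupervised ensemble regression.

    F      : (m, n) array of m experts' predictions on n samples.
    mu_y   : assumed-known mean of the response.
    var_y  : assumed-known variance of the response.
    Returns (y_hat, w): combined predictions and expert weights.
    """
    m, n = F.shape
    # Center each expert's predictions over the samples.
    Fc = F - F.mean(axis=1, keepdims=True)
    # m x m sample covariance of the experts' predictions.
    C = Fc @ Fc.T / n
    # Leading eigenvector of the covariance (eigh returns ascending order).
    eigvals, eigvecs = np.linalg.eigh(C)
    v = eigvecs[:, -1]
    # Fix the sign so the weights are predominantly positive.
    v = v * np.sign(v.sum())
    # Normalize the weights to sum to one.
    w = v / v.sum()
    # Weighted combination of the centered predictions.
    y_hat = w @ Fc
    # Rescale to match the assumed-known response moments.
    y_hat = y_hat / y_hat.std() * np.sqrt(var_y) + mu_y
    return y_hat, w
```

In a quick simulation with experts of unequal noise levels, the leading-eigenvector weights down-weight noisier experts relative to a plain average, which is the intuition behind preferring this spectral combination over the ensemble mean.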
