A Principal Components Approach to Combining Regression Estimates

The goal of combining the predictions of multiple learned models is to form an improved estimator. A combining strategy must robustly handle the inherent correlation, or multicollinearity, among the learned models while identifying the unique contribution of each. A progression of existing approaches and their limitations with respect to these two issues is discussed. A new approach, PCR*, based on principal components regression, is proposed to address these limitations. An evaluation of the new approach on a collection of domains reveals that (1) PCR* was the most robust combining method, (2) correlation could be handled without eliminating any of the learned models, and (3) the principal components of the learned models provided a continuum of “regularized” weights from which PCR* could choose.
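
As a rough illustration of the idea, the sketch below combines a matrix of model predictions via principal components regression. The names (pcr_combine, select_k), the hold-out selection of the number of components, and all parameters are assumptions for illustration, not the paper's actual PCR* procedure.

```python
import numpy as np

def pcr_combine(preds, y, k):
    """Fit a k-component principal components regression of y on the
    (n_samples, n_models) prediction matrix; return model weights
    and an intercept. Assumes the top-k singular values are nonzero."""
    p_mean, y_mean = preds.mean(axis=0), y.mean()
    P, t = preds - p_mean, y - y_mean
    # SVD of the centered predictions; rows of Vt are the principal
    # directions in "model space".
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    coef = (U[:, :k].T @ t) / s[:k]   # least-squares coefs in PC space
    w = Vt[:k].T @ coef               # map back to weights on the models
    return w, y_mean - p_mean @ w

def select_k(preds, y, holdout=0.3, seed=0):
    """Pick the number of components by held-out squared error, a
    stand-in for whichever selection criterion PCR* actually uses."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(len(y) * (1 - holdout))
    tr, va = idx[:cut], idx[cut:]
    best_k, best_err = 1, np.inf
    for k in range(1, preds.shape[1] + 1):
        w, b = pcr_combine(preds[tr], y[tr], k)
        err = np.mean((preds[va] @ w + b - y[va]) ** 2)
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```

Truncating to the leading components shrinks the influence of directions in which the models' predictions nearly coincide, which is how correlation can be handled without discarding any individual model; sweeping k from 1 to the number of models yields the continuum of regularized weight vectors the abstract refers to.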
