Subset Selection in Regression

OBJECTIVES
    Prediction, Explanation, Elimination or What?
    How Many Variables in the Prediction Formula?
    Alternatives to Using Subsets
    'Black Box' Use of Best-Subsets Techniques

LEAST-SQUARES COMPUTATIONS
    Using Sums of Squares and Products Matrices
    Orthogonal Reduction Methods
    Gauss-Jordan v. Orthogonal Reduction Methods
    Interpretation of Projections
    Appendix A: Operation Counts for All-Subsets Regression

FINDING SUBSETS WHICH FIT WELL
    Objectives and Limitations of this Chapter
    Forward Selection
    Efroymson's Algorithm
    Backward Elimination
    Sequential Replacement Algorithm
    Replacing Two Variables at a Time
    Generating All Subsets
    Using Branch-and-Bound Techniques
    Grouping Variables
    Ridge Regression and Other Alternatives
    The Non-Negative Garrote and the Lasso
    Some Examples
    Conclusions and Recommendations

HYPOTHESIS TESTING
    Is There Any Information in the Remaining Variables?
    Is One Subset Better than Another?
    Appendix A: Spjøtvoll's Method - Detailed Description

WHEN TO STOP?
    What Criterion Should We Use?
    Prediction Criteria
    Cross-Validation and the PRESS Statistic
    Bootstrapping
    Likelihood and Information-Based Stopping Rules
    Appendix A: Approximate Equivalence of Stopping Rules

ESTIMATION OF REGRESSION COEFFICIENTS
    Selection Bias
    Choice Between Two Variables
    Selection Bias in the General Case, and its Reduction
    Conditional Likelihood Estimation
    Estimation of Population Means
    Estimating Least-Squares Projections
    Appendix A: Changing Projections to Equate Sums of Squares

BAYESIAN METHODS
    Bayesian Introduction
    'Spike and Slab' Prior
    Normal Prior for Regression Coefficients
    Model Averaging
    Picking the Best Model

CONCLUSIONS AND SOME RECOMMENDATIONS

REFERENCES

INDEX