Building Better Models

Analytic techniques developed for big data have much broader applications in the social sciences, outperforming standard regression models even, and indeed especially, in smaller datasets. This article offers an overview of machine learning methods well suited to social science problems, including decision trees, dimension reduction methods, nearest-neighbor algorithms, support vector machines, and penalized regression. Beyond novel algorithms, machine learning places great emphasis on model checking (through holdout samples and cross-validation) and on model shrinkage (pulling predictions toward the mean to reduce overfitting). This article advocates replacing typical regression analyses with two different sorts of models used in concert: a multi-algorithm ensemble approach to establish the noise floor of a given dataset, and simpler methods such as penalized regression or decision trees for theory building and hypothesis testing.
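Two of the ideas the abstract emphasizes, model checking via cross-validation and shrinkage via penalized regression, can be sketched in a few lines. The following is a minimal illustration (not code from the article), using scikit-learn's lasso with a cross-validated penalty; the synthetic data, sample sizes, and coefficient values are assumptions chosen only to make the example self-contained.

```python
# Illustrative sketch: cross-validated lasso vs. OLS on synthetic data.
# All data-generating choices here are hypothetical, for demonstration only.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, p = 200, 20                       # modest n, many candidate predictors
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # only the first 3 predictors matter
y = X @ beta + rng.normal(scale=2.0, size=n)

# Shrinkage: lasso with its penalty strength chosen by 5-fold cross-validation.
lasso = LassoCV(cv=5).fit(X, y)

# Model checking: out-of-sample R^2 estimated by cross-validation.
ols_r2 = cross_val_score(LinearRegression(), X, y, cv=5).mean()
lasso_r2 = cross_val_score(LassoCV(cv=5), X, y, cv=5).mean()
print(f"OLS cross-validated R^2:   {ols_r2:.3f}")
print(f"Lasso cross-validated R^2: {lasso_r2:.3f}")
print("Predictors retained by the lasso:", np.flatnonzero(lasso.coef_))
```

The penalty shrinks the coefficients on irrelevant predictors toward (and often exactly to) zero, which is the overfitting-reduction mechanism the abstract describes; the cross-validated R^2 plays the role of the holdout-sample model check.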
