Bayesian Model Selection in Social Research

It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a single model, they ignore model uncertainty and so underestimate the uncertainty about quantities of interest. The Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented. Implementing this is straightforward through the use of the simple and accurate BIC approximation, and it can be done using the output from standard software. Specific results are presented for most of the types of model commonly used in sociology. It is shown that this approach overcomes the difficulties with P-values and standard model selection procedures based on them. It also allows easy comparison of nonnested models, and permits the quantification of the evidence for a null hypothesis of interest, such as a convergence theory or a hypothesis about societal norms.
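As an illustration of how the BIC approximation can be computed from standard regression output, the sketch below uses one commonly quoted form for linear regression, BIC = n log(1 - R^2) + p log(n), measured relative to the intercept-only model. It converts BIC values into approximate posterior model probabilities under equal prior model probabilities. The abstract states neither formula, so both are assumptions here, and the sample size, R^2 values, and model names are hypothetical.

```python
import math


def bic_linear(r_squared, n, p):
    """BIC of a linear regression relative to the intercept-only model.

    Assumed form: n * log(1 - R^2) + p * log(n), where p counts the
    independent variables (excluding the intercept). Lower is better.
    """
    return n * math.log(1.0 - r_squared) + p * math.log(n)


def posterior_model_probs(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities for all models; each weight is
    proportional to exp(-BIC / 2). Subtracting the minimum BIC first
    keeps the exponentials numerically stable.
    """
    best = min(bics)
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical regression output: model name -> (R^2, number of predictors)
n = 1000
models = {
    "null": (0.0, 0),
    "x1": (0.100, 1),
    "x1+x2": (0.102, 2),
}

bics = {name: bic_linear(r2, n, p) for name, (r2, p) in models.items()}
probs = dict(zip(bics, posterior_model_probs(list(bics.values()))))

for name in models:
    print(f"{name:6s} BIC = {bics[name]:9.2f}  P(model | data) ~ {probs[name]:.3f}")
```

With these made-up numbers, the small gain in R^2 from adding x2 does not offset the log(n) penalty, so the one-variable model receives most of the posterior weight. Averaging over models with these weights, rather than selecting a single model, is one way to account for model uncertainty.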
