Know your population and know your model: Using model-based regression and poststratification to generalize findings beyond the observed sample.

Psychology research often focuses on interactions, and this has deep implications for inference from nonrepresentative samples. For the goal of estimating average treatment effects, we propose to fit a model allowing treatment to interact with background variables and then average over the distribution of these variables in the population. This can be seen as an extension of multilevel regression and poststratification (MRP), a method used in political science and other areas of survey research, where researchers wish to generalize from a sparse and possibly nonrepresentative sample to the general population. In this article, we discuss areas where this method can be used in the psychological sciences. We use our method to estimate the norming distribution for the Big Five Personality Scale using open-source data. We argue that large open data sources such as this one, along with other collaborative data sources, can potentially be combined with MRP to help resolve current challenges of generalizability and replication in psychology. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
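
The two steps described above (fit a model in which treatment interacts with background variables, then average over the population distribution of those variables) can be illustrated with a minimal sketch. This is not the authors' code: it uses simulated data, a single binary background variable, and ordinary least squares as a stand-in for the multilevel model that MRP would use. The function name `poststratified_ate`, the population shares, and the effect sizes are all hypothetical.

```python
import numpy as np

def poststratified_ate(group, treat, y, pop_share):
    """Fit y ~ group * treat by least squares (one intercept and one
    treatment slope per group), then average the group-specific
    treatment effects over the population shares of the groups.
    Plain least squares is a simplification; MRP would use a
    multilevel model with partial pooling across groups."""
    group = np.asarray(group)
    treat = np.asarray(treat, dtype=float)
    y = np.asarray(y, dtype=float)
    n_groups = len(pop_share)
    onehot = np.eye(n_groups)[group]                   # group indicators
    X = np.hstack([onehot, onehot * treat[:, None]])   # intercepts, then slopes
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    cell_effects = coef[n_groups:]                     # treatment effect per group
    return cell_effects, float(np.dot(pop_share, cell_effects))

# Simulated, nonrepresentative sample (all numbers are assumptions):
# group 0 is about 80% of the sample but only 30% of the population.
rng = np.random.default_rng(0)
n = 4000
group = (rng.random(n) < 0.2).astype(int)
treat = rng.integers(0, 2, size=n)
true_effect = np.array([1.0, 3.0])                     # effect differs by group
y = 0.5 * group + true_effect[group] * treat + rng.normal(0, 1, n)
pop_share = np.array([0.3, 0.7])

cell_effects, ate_pop = poststratified_ate(group, treat, y, pop_share)
naive = y[treat == 1].mean() - y[treat == 0].mean()    # sample-average effect

print("group-specific effects:", cell_effects)
print("naive sample estimate:", naive)
print("poststratified estimate:", ate_pop)
```

In this simulation the true population average treatment effect is 0.3 × 1.0 + 0.7 × 3.0 = 2.4, while the sample composition implies an average of about 0.8 × 1.0 + 0.2 × 3.0 = 1.4. The naive difference in means targets the latter; reweighting the group-specific effects by population shares targets the former.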
