A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases

Political scientists lack methods to efficiently measure the priorities political actors emphasize in statements. To address this limitation, I introduce a statistical model that attends to the structure of political rhetoric when measuring expressed priorities: statements are naturally organized by author. The expressed agenda model exploits this structure to estimate simultaneously the topics in the texts and the attention political actors allocate to those topics. I apply the method to a collection of over 24,000 press releases issued by senators in 2007, and I demonstrate that press releases are an ideal medium for measuring how senators explain their work in Washington to constituents. A set of examples validates the estimated priorities and demonstrates their usefulness for testing theories of how members of Congress communicate with constituents. The statistical model and its extensions will be made available in a forthcoming free software package for the R computing language.

[28]  Joydeep Ghosh,et al.  A Unified Framework for Model-based Clustering , 2003, J. Mach. Learn. Res..

[29]  Jane J. Mansbridge Rethinking Representation , 2003, American Political Science Review.

[30]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[31]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[32]  Bo Wang,et al.  Convergence and Asymptotic Normality of Variational Bayesian Approximations for Expon , 2004, UAI.

[33]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[34]  L. Sigelman,et al.  Avoidance or Engagement? Issue Convergence in U.S. Presidential Campaigns, 1960–2000 , 2004 .

[35]  M. McCombs Setting the Agenda: The Mass Media and Public Opinion , 2004 .

[36]  R. Arnold Congress, the Press, and Political Accountability , 2004 .

[37]  David J. C. MacKay,et al.  Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.

[38]  Inderjit S. Dhillon,et al.  Clustering on the Unit Hypersphere using von Mises-Fisher Distributions , 2005, J. Mach. Learn. Res..

[39]  L. Fowler Congressional Communication: Content and Consequences , 2005, Perspectives on Politics.

[40]  T. Sulkin Issue Politics in Congress , 2005 .

[41]  John D. Lafferty,et al.  Correlated Topic Models , 2005, NIPS.

[42]  Brian F. Schaffner Local News Coverage and the Incumbency Advantage in the U.S. House , 2006 .

[43]  E. Armstrong,et al.  Whose deaths matter? Mortality, advocacy, and attention to disease in the mass media. , 2006, Journal of health politics, policy and law.

[44]  John D. Lafferty,et al.  Dynamic topic models , 2006, ICML.

[45]  Michael I. Jordan,et al.  Hierarchical Dirichlet Processes , 2006 .

[46]  Michael I. Jordan,et al.  Variational inference for Dirichlet process mixtures , 2006 .

[47]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[48]  Kenneth F. Scheve,et al.  Estimating the Effect of Elite Communications on Public Opinion Using Instrumental Variables , 2007 .

[49]  Gary King,et al.  Extracting Systematic Social Science Meaning from Text 1 , 2007 .

[50]  Dustin Hillard,et al.  Computer-Assisted Topic Classification for Mixed-Methods Social Science Research , 2008 .

[51]  David M. Blei,et al.  Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation , 2008, NIPS.

[52]  Andrew McCallum,et al.  Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression , 2008, UAI.

[53]  Frances E. Lee Dividers, Not Uniters: Presidential Leadership and Senate Partisanship, 1981-2004 , 2008, The Journal of Politics.

[54]  Joseph Hilbe,et al.  Data Analysis Using Regression and Multilevel/Hierarchical Models , 2009 .

[55]  Gary King,et al.  Quantitative Discovery from Qualitative Information: A General-Purpose Document Clustering Methodology , 2009 .

[56]  Jeff Gill,et al.  Circular Data in Political Science and How to Handle It , 2010, Political Analysis.

[57]  Dragomir R. Radev,et al.  How to Analyze Political Attention with Minimal Assumptions and Costs , 2010 .

[58]  Christopher D. Manning,et al.  Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..