This paper analyzes the legislative speech records of the 101st-108th U.S. Congresses using machine learning and natural language processing methods. We represent the speeches in both the Senate and the House as word vectors, and then apply text categorization methods to classify the speakers by their ideological positions. We treat classification accuracy as a measure of how distinct the liberal and conservative ideologies are: the higher the accuracy, the sharper the separation. Our experimental results show increasing partisanship in Congress between 1989 and 2006. Ideology classifiers trained on the House speeches predict the Senators' ideological positions well (House-to-Senate prediction); however, the Senate-to-House prediction is less successful. Overall, the results provide evidence for a long-term increase in partisanship in both chambers, with the House consistently more ideologically divided than the Senate.

1 School of Information Studies, Syracuse University. Department of Managerial Economics and Decision Sciences (MEDS) and Ford Motor Company Center for Global Citizenship, Kellogg School of Management and Northwestern Institute on Complex Systems (NICO), Northwestern University.
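The cross-chamber setup described above (train on one chamber, predict the other, read accuracy as a signal of ideological separation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy speeches and labels are invented, the function names are hypothetical, and multinomial Naive Bayes is a stand-in classifier, since the abstract specifies only "text categorization methods".

```python
import math
from collections import Counter


def word_vector(text):
    """Bag-of-words count vector for one speaker's concatenated speeches."""
    return Counter(text.lower().split())


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (stand-in classifier, an assumption)."""
    vocab = set()
    word_counts = {}
    doc_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        vec = word_vector(doc)
        word_counts.setdefault(label, Counter()).update(vec)
        vocab.update(vec)
    return {"vocab": vocab, "word_counts": word_counts, "doc_counts": doc_counts}


def predict_nb(model, doc):
    """Return the label with the highest log posterior under add-one smoothing."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label in sorted(model["doc_counts"]):
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(model["doc_counts"][label] / total_docs)
        for word, n in word_vector(doc).items():
            if word in model["vocab"]:  # ignore words unseen in training
                score += n * math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    """Fraction of speakers whose ideology label is predicted correctly."""
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Invented toy data, one string per speaker; not drawn from the Congressional record.
house_docs = [
    "tax cuts growth defense freedom",
    "defense spending tax relief business",
    "healthcare workers education environment",
    "education funding healthcare families workers",
]
house_labels = ["conservative", "conservative", "liberal", "liberal"]
senate_docs = ["tax relief defense", "healthcare education workers"]
senate_labels = ["conservative", "liberal"]

# House-to-Senate prediction: train on the House, evaluate on the Senate.
house_model = train_nb(house_docs, house_labels)
house_to_senate = accuracy(house_model, senate_docs, senate_labels)
print("House-to-Senate accuracy:", house_to_senate)
```

Swapping the roles of the two chambers gives the Senate-to-House direction, and repeating the procedure per Congress would yield an accuracy series that can be compared over time.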