Who Says What with Whom: Using Bi-Spectral Clustering to Organize and Analyze Social Media Protest Networks

Social media protest networks involve many participants, from long-time activists to individuals who engage only in a particular protest event. We propose a new approach to studying these communities of users. Our approach hinges on two methodological decisions. First, rather than studying only the tweets central to one event, we collect full timelines of user activity. Second, we propose bi-spectral clustering as a scalable computational method for rapidly identifying sub-communities of users and hashtags. Using as a case study a large sample of tweets from users who discussed the 2016 protests in Charlotte following the extrajudicial killing of Keith Lamont Scott, we demonstrate how bi-spectral clustering can be applied quickly and iteratively to sort, sample, and extract ideologically and thematically coherent clusters from a large Twitter network. We also describe how bi-spectral clustering compares with latent Dirichlet allocation, a popular alternative, for this task. Our proposed approach meaningfully extends existing methods for computationally sorting and clustering large-scale network data: it allows researchers to look beyond focal hashtags or keywords, to situate protest messages within the broader context of the messages that users tend to produce, and to do so with fewer ad hoc modeling decisions.
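
To make the idea of jointly clustering users and hashtags concrete, the following is a minimal sketch of a two-way bi-spectral partition of a user-by-hashtag count matrix. It is an illustration under stated assumptions, not the pipeline used in this study: the function name `bispectral_bipartition`, the toy `counts` matrix, and the choice to split on the sign of the second singular vector are all hypothetical choices made for the example.

```python
import numpy as np


def bispectral_bipartition(counts):
    """Split users (rows) and hashtags (columns) into two co-clusters.

    Hypothetical helper: scales the count matrix by the inverse square
    roots of its row and column sums, takes the SVD, and splits both
    users and hashtags on the sign of the second singular vectors.
    """
    counts = np.asarray(counts, dtype=float)
    row_deg = counts.sum(axis=1)  # total hashtag uses per user
    col_deg = counts.sum(axis=0)  # total uses per hashtag
    if np.any(row_deg == 0) or np.any(col_deg == 0):
        raise ValueError("every user and hashtag needs at least one count")

    r = 1.0 / np.sqrt(row_deg)
    c = 1.0 / np.sqrt(col_deg)
    # Degree-normalized matrix D_r^{-1/2} A D_c^{-1/2}.
    normalized = r[:, None] * counts * c[None, :]

    # Singular values come back in descending order; the first pair of
    # singular vectors is trivial, so the second pair carries the split.
    u, _, vt = np.linalg.svd(normalized, full_matrices=False)
    user_embedding = r * u[:, 1]
    hashtag_embedding = c * vt[1, :]

    user_labels = (user_embedding > 0).astype(int)
    hashtag_labels = (hashtag_embedding > 0).astype(int)
    return user_labels, hashtag_labels


# Toy data (invented for illustration): rows are 4 users, columns are
# 4 hashtags. Users 0-1 mostly use hashtags 0-1; users 2-3 mostly use
# hashtags 2-3, with a little cross-use linking the two groups.
counts = np.array([
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 0, 3, 2],
    [1, 0, 2, 3],
])

user_labels, hashtag_labels = bispectral_bipartition(counts)
print("user clusters:   ", user_labels)
print("hashtag clusters:", hashtag_labels)
```

Because the sign of a singular vector is arbitrary, which group is labeled 0 and which is labeled 1 may vary; only the grouping itself is meaningful. For more than two clusters, a common extension keeps several singular vectors and runs k-means on the stacked user and hashtag embeddings.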