Exploratory community sensing in social networks

Most social networks implement some form of groups or communities that users can voluntarily join. Twitter has no such functionality, and hence no notion of a formal group or community. We propose a method for identifying communities and assigning semantic meaning to the discussion topics of the resulting communities. Using this method and a sample of roughly a month's worth of Tweets from Twitter's "gardenhose" feed, we demonstrate the discovery of meaningful user communities on Twitter.

We examine Twitter data streaming in real time and treat it as a sensor. Twitter is a social network that pioneered microblogging, with messages short enough to fit in an SMS; individuals, businesses, media outlets and even devices all over the world post status updates from a variety of clients, browsers, smart phones and PDAs. An aggregate trend in such statuses often reflects an important development in the world, as demonstrated by the Iran and Moldova elections and the Tiananmen anniversary in China. We propose using Twitter as a sensor: tracking individuals and communities of interest, and characterizing individual roles and the dynamics of their communications. We developed a novel algorithm for community identification in social networks that is based on direct communication rather than on linking. We show how to find communities of interest and then browse their neighborhoods by either the similarity or the diversity of the individuals and groups adjacent to the one of interest. We use frequent collocations and statistically improbable phrases to summarize the focus of a community, giving a quick overview of its main topics. Our methods provide insight into the largest social sensor network in the world and constitute a platform for social sensing.
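
To make the idea of communication-based communities concrete, the following minimal Python sketch illustrates one possible reading of it; it is not the algorithm developed in this work. It assumes that "direct communication" means @-mentions, that a community can be approximated by a connected component of reciprocated mentions, and that the most frequent word bigrams can stand in for collocations (statistically improbable phrases are not modeled). The function names `reply_edges`, `communities` and `top_bigrams` and the sample tweets are hypothetical, introduced only for illustration.

```python
import re
from collections import Counter, defaultdict

MENTION = re.compile(r"@(\w+)")
WORD = re.compile(r"[a-z']+")


def reply_edges(tweets):
    """Count directed (author -> mentioned user) messages."""
    counts = Counter()
    for author, text in tweets:
        author = author.lower()
        for target in MENTION.findall(text):
            target = target.lower()
            if target != author:
                counts[(author, target)] += 1
    return counts


def communities(counts):
    """Assumption: a community is a connected component of users
    joined by reciprocated mentions (a stand-in, not the paper's method)."""
    neighbors = defaultdict(set)
    for a, b in counts:
        if (b, a) in counts:
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over one component
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(neighbors[user] - group)
        seen |= group
        groups.append(group)
    return groups


def top_bigrams(tweets, members, k=3):
    """Most frequent adjacent word pairs in tweets written by members."""
    counts = Counter()
    for author, text in tweets:
        if author.lower() in members:
            words = WORD.findall(MENTION.sub(" ", text.lower()))
            counts.update(zip(words, words[1:]))
    return counts.most_common(k)


# Hypothetical sample data, for illustration only.
tweets = [
    ("alice", "@bob iran election results look disputed"),
    ("bob", "@alice following the iran election closely"),
    ("carol", "@dave new phone firmware is out"),
    ("dave", "@carol phone firmware update bricked mine"),
    ("erin", "@alice hello"),  # never answered, so erin joins no community
]

groups = communities(reply_edges(tweets))
for group in groups:
    print(sorted(group), top_bigrams(tweets, group, k=1))
```

On this toy input the sketch yields two communities, each summarized by its most frequent bigram. Requiring reciprocated mentions is a deliberately strict design choice that filters out one-way broadcasts; a weighted graph with a mention-count threshold would be an equally plausible alternative.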