An Approach to Incorporate Texts into a Social Network Analysis of Communication Graphs

Social network analysis (SNA) provides tools to examine relationships between people. Text mining (TM) captures the text that people produce, for example in Web 2.0 applications, but it neglects their social structure. This paper applies an approach that combines the two methods, named "content-based SNA" (CB-SNA). Using the R mailing lists R-help and R-devel, we show how this combination can be used to describe people's interests and to find out whether authors who have similar interests actually communicate. We find that the expected positive relationship between sharing interests and communicating gets stronger as the centrality scores of authors in the communication networks increase.
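
As a rough illustration of the idea of comparing shared interests with actual communication, the following minimal Python sketch may help. It is not the procedure used in this paper: the toy posts, the reply edges, the use of raw term counts with cosine similarity as an interest measure, and the use of degree as the centrality score are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: author -> concatenated text of their postings.
posts = {
    "ann": "regression model fit residuals regression",
    "bob": "regression model plot residuals",
    "cat": "package build compile namespace",
    "dan": "compile package namespace build error",
}

# Hypothetical communication graph: undirected reply edges between authors.
edges = {
    frozenset(("ann", "bob")),
    frozenset(("cat", "dan")),
    frozenset(("bob", "cat")),
}


def term_vector(text):
    """Bag-of-words term counts (assumed stand-in for an interest profile)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


vectors = {author: term_vector(text) for author, text in posts.items()}

# Degree centrality (assumed choice): number of distinct communication partners.
degree = Counter(author for edge in edges for author in edge)

# One record per author pair: interest similarity and whether they communicate.
pairs = []
for a, b in combinations(sorted(posts), 2):
    pairs.append({
        "pair": (a, b),
        "similarity": cosine(vectors[a], vectors[b]),
        "communicate": frozenset((a, b)) in edges,
        "min_degree": min(degree[a], degree[b]),
    })


def mean_similarity(rows):
    """Average interest similarity over a list of pair records."""
    rows = list(rows)
    return sum(r["similarity"] for r in rows) / len(rows) if rows else 0.0


linked = mean_similarity(r for r in pairs if r["communicate"])
unlinked = mean_similarity(r for r in pairs if not r["communicate"])
print(f"mean similarity, communicating pairs:     {linked:.3f}")
print(f"mean similarity, non-communicating pairs: {unlinked:.3f}")
```

The `min_degree` field is kept on each pair so that the comparison could be repeated on subsets of increasingly central authors, which is one simple way to probe whether the relationship strengthens with centrality.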