A New Content-Based Model for Social Network Analysis

This paper presents a novel model for social network analysis in which, rather than analyzing the quantity of relationships (co-authorships, business relations, friendships, etc.), we analyze their communicative content. Text mining and clustering techniques are used to capture the content of communication and to identify the most popular themes. The social analyst can then study the evolution of the network in terms of the relevant themes of collaboration, detect new concepts that are gaining popularity, and identify popular themes that could benefit from better cooperation. The methodology is tested on INTEROP, a network of excellence on enterprise interoperability.
