Social network mining based on improved vector space model

In this paper, we present a method for mining social networks of person entities from Wikipedia. Each person entity is represented as a vector built from the anchor text set and the content text set of that person's Wikipedia page, using an Improved Vector Space Model (IVSM). The cosine similarity between two vectors measures the similarity of the corresponding person entities, and computing it for every pair yields a similarity matrix over all person entities. We then extract from this matrix a social network that shows the relations among the person entities. Experiments on social network analysis using Wikipedia data show that the proposed mining approach is effective.
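The pipeline described above (vector per person, pairwise cosine similarity, network extraction) can be illustrated with a minimal sketch. The abstract does not specify how IVSM weights terms or how edges are selected, so several pieces here are assumptions made only for illustration: the `anchor_weight` boost for anchor-text terms, the fixed similarity `threshold` for keeping an edge, and the toy pages `person_a`, `person_b`, and `person_c` with their terms.

```python
import math
from collections import Counter
from itertools import combinations


def build_vector(anchor_terms, content_terms, anchor_weight=2.0):
    """Build a term-weight vector for one person page.

    Assumption: anchor-text terms count `anchor_weight` times as much as
    content-text terms. The actual IVSM weighting is not given in the abstract.
    """
    vec = Counter()
    for term in content_terms:
        vec[term] += 1.0
    for term in anchor_terms:
        vec[term] += anchor_weight
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as term -> weight maps."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Symmetric matrix (nested dict) of pairwise cosine similarities."""
    names = sorted(vectors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        sim[a][b] = sim[b][a] = cosine(vectors[a], vectors[b])
    return sim


def extract_network(sim, threshold=0.3):
    """Keep an edge for each pair whose similarity reaches the threshold.

    Assumption: a fixed cutoff; the paper may select edges differently.
    """
    names = sorted(sim)
    return [(a, b, sim[a][b]) for a, b in combinations(names, 2)
            if sim[a][b] >= threshold]


# Hypothetical toy pages: (anchor text terms, content text terms).
pages = {
    "person_a": (["physics", "relativity"], ["physics", "theory", "nobel"]),
    "person_b": (["physics", "quantum"], ["physics", "theory", "atom"]),
    "person_c": (["painting"], ["art", "painting", "museum"]),
}

vectors = {name: build_vector(a, c) for name, (a, c) in pages.items()}
sim = similarity_matrix(vectors)
edges = extract_network(sim)

for a, b, s in edges:
    print(f"{a} -- {b}: {s:.3f}")
```

On this toy data only `person_a` and `person_b` share terms, so they form the single edge of the extracted network, while `person_c` stays isolated. Sparse dictionaries are used instead of dense arrays because each page touches only a small part of the vocabulary.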
