WebScore: An Effective Page Scoring Approach for Uncertain Web Social Networks

To score pages effectively under uncertainty in web social networks, we first proposed a new concept called the transition probability matrix and formally defined uncertainty in web social networks. Second, we proposed a hybrid page scoring algorithm, called WebScore, based on the PageRank algorithm and three centrality measures: degree, betweenness, and closeness. In particular, WebScore takes full account of the uncertainty of web social networks by computing the transition probability from one page to another. The basic idea of WebScore is to (1) integrate uncertainty into PageRank in order to rank pages accurately, and (2) apply the centrality measures to calculate the importance of pages in web social networks. To verify the performance of WebScore, we developed a web social network analysis system that can partition web pages into distinct groups and score them effectively. Finally, we conducted extensive experiments on real data, and the results show that WebScore scores uncertain pages effectively and at a lower time cost than page scoring algorithms based on PageRank and centrality measures.
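The two-part idea described above can be illustrated with a minimal Python sketch. It is not the WebScore algorithm itself, whose exact formulas are not given here. It assumes that uncertainty is modelled as a per-link existence probability, that transition probabilities are obtained by row-normalising those probabilities, and that the hybrid score is a weighted sum controlled by a hypothetical parameter `alpha`. Only degree centrality is shown; betweenness and closeness are omitted for brevity. All function names and the example graph are illustrative.

```python
def transition_matrix(edge_prob, n):
    """Row-normalise link existence probabilities into transition probabilities."""
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(edge_prob.get((i, j), 0.0) for j in range(n))
        for j in range(n):
            if total > 0:
                T[i][j] = edge_prob.get((i, j), 0.0) / total
            else:
                T[i][j] = 1.0 / n  # page without out-links: jump uniformly
    return T


def pagerank(T, damping=0.85, iters=100):
    """Power iteration on the transition matrix T."""
    n = len(T)
    r = [1.0 / n] * n
    for _ in range(iters):
        r = [
            (1 - damping) / n + damping * sum(r[i] * T[i][j] for i in range(n))
            for j in range(n)
        ]
    return r


def expected_degree(edge_prob, n):
    """Expected (in + out) degree of each page, scaled to [0, 1]."""
    deg = [0.0] * n
    for (i, j), p in edge_prob.items():
        deg[i] += p
        deg[j] += p
    top = max(deg) or 1.0
    return [d / top for d in deg]


def hybrid_score(edge_prob, n, alpha=0.5):
    """Assumed combination: alpha * scaled PageRank + (1 - alpha) * degree."""
    pr = pagerank(transition_matrix(edge_prob, n))
    top = max(pr)
    pr_scaled = [x / top for x in pr]
    deg = expected_degree(edge_prob, n)
    return [alpha * p + (1 - alpha) * d for p, d in zip(pr_scaled, deg)]


# Hypothetical 4-page network: (source, target) -> probability the link exists.
edge_prob = {(0, 1): 0.9, (0, 2): 0.3, (1, 2): 0.8, (2, 0): 0.6, (3, 2): 0.5}
T = transition_matrix(edge_prob, 4)
ranks = pagerank(T)
scores = hybrid_score(edge_prob, 4)
```

In this toy network, page 2 receives the most probable in-links, so it ranks first under both components. Changing `alpha` shifts the balance between the link-following component and the centrality component.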
