PAV: A novel model for ranking heterogeneous objects in bibliographic information networks

Bibliographic information networks, formed by online bibliographic databases such as the ACM Digital Library and the IEEE/IET Electronic Library, contain abundant information about authors, papers, and venues (journals and conferences), and have been widely studied in recent years. However, few studies examine the problem of ranking objects in these networks. In this paper, we study this problem and present a novel model, called PAV, for ranking heterogeneous objects such as authors, papers, and venues. Based on the PAV model, we transform the problem of ranking objects into the problem of estimating a probability distribution. We propose an efficient algorithm that estimates the probability parameters by exploiting the fact that the PAV model is a regular Markov chain. To evaluate the PAV model, we apply it to a real dataset crawled from the ACM Digital Library. The experimental results show that the proposed model is effective.
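To illustrate why regularity matters for ranking: a regular Markov chain has a unique stationary distribution, and repeated multiplication by the transition matrix converges to it from any starting distribution. The sketch below is not the PAV algorithm itself, which the abstract does not specify. It is a plain power iteration on a hypothetical four-object network; the object names, the transition probabilities, and the function name `stationary_distribution` are all assumptions made for illustration.

```python
# Hypothetical toy network: one author, two papers, one venue.
# Row i holds the probabilities of moving from object i to each object j.
# These values are invented for illustration, not taken from the paper.
names = ["author_1", "paper_1", "paper_2", "venue_1"]
P = [
    [0.0, 0.5, 0.5, 0.0],  # author_1 -> the papers it wrote
    [0.4, 0.0, 0.2, 0.4],  # paper_1  -> its author, a cited paper, its venue
    [0.5, 0.0, 0.0, 0.5],  # paper_2  -> its author, its venue
    [0.0, 0.5, 0.5, 0.0],  # venue_1  -> the papers it published
]


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi * P until pi stops changing.

    Assumes P is row-stochastic and the chain is regular, so the
    limit exists and does not depend on the starting vector.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Stop when the L1 change between iterations is below tol.
        if sum(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


scores = stationary_distribution(P)
# Rank all objects, regardless of type, by stationary probability.
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy network the citation link from paper_1 to paper_2 breaks the strict alternation between papers and non-papers; without such a link the chain would be periodic rather than regular, and the iteration would not be guaranteed to converge.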
