Identifying Important Nodes in Heterogeneous Networks

This position paper presents a new approach to identifying important nodes, or entities, in a complex heterogeneous network. We define an importance score in terms of a statistical model: an individual is important to the extent that explicitly including that individual in the model improves the model's fit to the data more than it increases the model's complexity. We apply techniques from statistical-relational learning, a recent field that combines AI and machine learning, to identify statistically important individuals in a scalable manner. We evaluate our approach empirically on the OPTA soccer data set for the English Premier League.
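The trade-off between fit and complexity in this definition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the model is a simple Bernoulli success rate, the complexity cost is a BIC-style penalty of half the log of the sample size for the one extra parameter, and the function names and player counts are hypothetical. The baseline model shares one success probability across all individuals; the extended model gives one individual a separate probability.

```python
import math


def max_loglik(successes, trials):
    """Bernoulli log-likelihood at the maximum-likelihood success rate."""
    if trials == 0:
        return 0.0  # no observations contribute nothing to the likelihood
    eps = 1e-12  # clamp so log() is defined when the rate is exactly 0 or 1
    p = min(max(successes / trials, eps), 1 - eps)
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def importance_score(counts, individual):
    """Fit gain from modelling `individual` separately, minus a complexity penalty.

    counts: dict mapping a name to (successes, trials).
    The BIC-style penalty is an illustrative assumption, not the paper's choice.
    """
    total_s = sum(s for s, _ in counts.values())
    total_n = sum(n for _, n in counts.values())

    # Baseline model: one success probability shared by everyone.
    base_ll = max_loglik(total_s, total_n)

    # Extended model: the individual gets a separate probability.
    s_i, n_i = counts[individual]
    ext_ll = max_loglik(s_i, n_i) + max_loglik(total_s - s_i, total_n - n_i)

    # One extra parameter costs 0.5 * log(sample size) under BIC.
    penalty = 0.5 * math.log(total_n)
    return (ext_ll - base_ll) - penalty


# Hypothetical counts: (successful events, attempts) per player.
counts = {"A": (30, 40), "B": (10, 40), "C": (10, 40), "D": (10, 40)}
scores = {name: importance_score(counts, name) for name in counts}
```

Under these assumptions, a player whose rate differs sharply from the pooled rate receives a positive score, while a player who behaves like everyone else receives a negative one, because the extra parameter buys no improvement in fit.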
