Mining heterogeneous information networks: a structural analysis approach

Most objects and data in the real world are of multiple types and interconnected, forming complex, heterogeneous, but often semi-structured information networks. However, most network science researchers focus on homogeneous networks and do not distinguish among the different types of objects and links in a network. We view interconnected, multi-typed data, including typical relational database data, as heterogeneous information networks, study how to leverage the rich semantics of the structural types of objects and links in these networks, and develop a structural analysis approach to mining semi-structured, multi-typed heterogeneous information networks. In this article, we summarize a set of methodologies that can effectively and efficiently mine useful knowledge from such information networks, and point out some promising research directions.
