Homophily of Neighborhood in Graph Relational Classifier

The quality of a collective-inference relational graph classifier depends on the degree of homophily in the classified graph. If homophily in the graph increases, the classifier assigns class membership to instances with a lower error rate. We propose replacing the traditionally used graph neighborhood method (based on the direct neighborhood of a vertex) with a local graph ranking algorithm (activation spreading), which provides a wider set of neighboring vertices together with their weights. We demonstrate that our approach increases homophily in the graph by inferring the optimal homophily distribution of a binary Simple Relational Classifier in an unweighted graph. We also validate this ability experimentally using the Social Network of the Slovak Companies dataset.
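
To make the contrast between the two neighborhood methods concrete, the following is a minimal Python sketch, not the implementation evaluated in the paper. It assumes a simple variant of activation spreading in which each vertex passes a decayed share of its energy equally to its neighbors until the share drops below a threshold. The function names, the `decay` and `threshold` values, the toy graph, and the weighted-vote estimate (written in the spirit of a binary relational neighbor classifier) are all illustrative assumptions.

```python
from collections import defaultdict


def direct_neighborhood(graph, v):
    """Traditional neighborhood: every direct neighbor gets weight 1.0."""
    return {u: 1.0 for u in graph[v]}


def spreading_activation(graph, start, initial=1.0, decay=0.5, threshold=0.05):
    """Assumed variant of activation spreading (hypothetical parameters).

    Each visited vertex splits `decay * energy` equally among its
    neighbors; spreading stops once a share falls below `threshold`.
    Returns accumulated activation per reached vertex, excluding `start`.
    """
    activation = defaultdict(float)
    frontier = [(start, initial)]
    while frontier:
        v, energy = frontier.pop()
        if not graph[v]:
            continue  # isolated vertex: nothing to spread to
        share = decay * energy / len(graph[v])
        if share < threshold:
            continue  # too little energy left to keep spreading
        for u in graph[v]:
            activation[u] += share
            frontier.append((u, share))
    activation.pop(start, None)  # the start vertex is not its own neighbor
    return dict(activation)


def weighted_vote(weights, labels):
    """Estimate P(class = 1) as the weighted share of labeled neighbors
    with label 1. Returns None when no neighbor carries a label."""
    total = sum(w for u, w in weights.items() if u in labels)
    if total == 0:
        return None
    positive = sum(w for u, w in weights.items() if labels.get(u) == 1)
    return positive / total


# Toy unweighted graph (adjacency lists); "x" is the vertex to classify.
graph = {"x": ["m"], "m": ["x", "p", "q"], "p": ["m"], "q": ["m"]}
labels = {"p": 1, "q": 1}  # "m" and "x" are unlabeled

narrow = direct_neighborhood(graph, "x")
wide = spreading_activation(graph, "x")

print("direct neighborhood:", narrow, "vote:", weighted_vote(narrow, labels))
print("spreading activation:", wide, "vote:", weighted_vote(wide, labels))
```

In this toy graph the direct neighborhood of `x` contains only the unlabeled vertex `m`, so no vote is possible, while the wider weighted neighborhood also reaches the labeled vertices `p` and `q`. Whether a wider neighborhood raises homophily on a real graph depends on the graph and on the chosen parameters.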
