Clustering via Information Access in a Network

Information flow in a graph (for example, a social network) has typically been modeled with standard influence propagation methods, with the goal of finding the most effective ways to spread information widely. More recently, researchers have begun to study how access to information differs among individuals within a network. This work suggests that information access is itself a potential form of privilege that depends on network position. Concerns about fairness usually focus on differences between demographic groups, but network position may itself define new groups worth studying. How, then, should position be characterized? Rather than applying standard graph clustering methods, we design and explore a clustering that explicitly incorporates models of how information flows on a network. Our goal is to identify clusters of nodes that are similar in their access to information across the network. We show, both formally and experimentally, that the resulting method is a new approach to network clustering. Experiments on a wide variety of datasets show that the technique groups together individuals who are similar according to an external measure of information access.
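To make the idea of clustering by information access concrete, the following is a minimal sketch and not the authors' implementation. It assumes an independent cascade model with a single transmission probability p on every edge, represents each node by its estimated probability of receiving information from every possible source, and groups those vectors with a plain k-means. The function names, the parameter values, and the two-clique toy graph are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def access_signatures(adj, p=0.3, trials=200, seed=0):
    """Monte Carlo estimate of information access vectors.

    Assumption: independent cascade with the same probability p on every
    undirected edge. adj maps each node to a list of neighbours and must be
    symmetric. Row i of the result holds, for node i, the estimated
    probability of being reached from each source node.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    counts = [[0] * n for _ in range(n)]
    for _ in range(trials):
        # Sample a "live-edge" graph: keep each undirected edge with prob. p.
        live = {v: [] for v in nodes}
        for u in nodes:
            for v in adj[u]:
                if index[u] < index[v] and rng.random() < p:
                    live[u].append(v)
                    live[v].append(u)
        # Every node reachable from s over live edges receives s's information.
        for s in nodes:
            seen = {s}
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in live[u]:
                    if v not in seen:
                        seen.add(v)
                        queue.append(v)
            for v in seen:
                counts[index[v]][index[s]] += 1
    return nodes, [[c / trials for c in row] for row in counts]


def dist2(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda x: min(dist2(x, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(x, centers[j])) for x in points]
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def two_cliques():
    """Hypothetical toy graph: two 4-cliques joined by the single edge (3, 4)."""
    edges = [(a, b) for grp in (range(0, 4), range(4, 8))
             for a in grp for b in grp if a < b]
    edges.append((3, 4))
    adj = {v: [] for v in range(8)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


nodes, signatures = access_signatures(two_cliques(), p=0.5, trials=300, seed=1)
print(dict(zip(nodes, kmeans(signatures, 2))))
```

In this sketch two nodes land in the same cluster when they tend to hear from the same sources with similar probability, not merely when they are densely connected to each other. The propagation model, the value of p, and the choice of k-means are assumptions that would need to be matched to the method actually described in the paper.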
