DUKE: A Solution for Discovering Neighborhood Patterns in Ego Networks

The rapid growth of social media websites and the ease of aggregating ever-richer social data raise a natural research question: can the different interaction patterns of individuals be captured and meaningfully interpreted for social network analysis? In this work, we present a novel solution that discovers occurrences of prototypical ‘ego network’ patterns in social media and mobile-phone networks, providing a data-driven instrument that the behavioral sciences can use for graph interpretation. We analyze nine datasets gathered from social media websites and mobile phones, using 13 network measures and three unsupervised clustering algorithms. We first apply an unsupervised feature similarity technique to reduce redundancy and extract compact features from the data. The reduced feature subsets are then used to discover ego patterns with various clustering techniques. Cluster analysis reveals eight distinct ego neighborhood patterns, or ego graphs. This categorization allows concise analysis of users’ data as they change over time. We provide a fine-grained analysis of the validity and quality of the clustering results. We verify the clustering in three ways: i) analyzing the clustering patterns for the same set of users crawled from three social media networks, ii) associating metadata information with the clusters and evaluating their performance on real networks, and iii) studying selected participants over an extended period to analyze their behavior.
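The pipeline described above (per-ego network measures, redundancy reduction, then clustering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method of this work: the three features stand in for the 13 network measures, a simple correlation threshold stands in for the unsupervised feature similarity technique, plain k-means with farthest-point seeding stands in for the three clustering algorithms, and the helper names (`build_adj`, `ego_features`, `drop_redundant`, `kmeans`) and the toy graph are hypothetical. Features are left unscaled for brevity.

```python
from itertools import combinations
from statistics import mean, pstdev

def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def ego_features(adj, ego):
    """Illustrative features of the 1-hop ego network:
    [number of alters, density among alters, total edge count]."""
    alters = adj[ego]
    k = len(alters)
    alter_edges = sum(1 for u, v in combinations(sorted(alters), 2) if v in adj[u])
    possible = k * (k - 1) / 2
    density = alter_edges / possible if possible else 0.0
    return [k, density, k + alter_edges]

def pearson(xs, ys):
    """Pearson correlation; 0.0 if either column is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    return mean((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def drop_redundant(rows, threshold=0.95):
    """Greedy stand-in for feature-similarity reduction: keep a column only if
    its |correlation| with every already-kept column is below the threshold."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[i])) < threshold for i in kept):
            kept.append(j)
    return kept

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels

# Toy graph (hypothetical): a star around "s" plus a 5-node clique.
edges = [("s", x) for x in "abcd"] + list(combinations("qwxyz", 2))
adj = build_adj(edges)
egos = list(adj)                                  # one ego network per node
rows = [ego_features(adj, e) for e in egos]       # feature matrix
kept = drop_redundant(rows)                       # indices of retained features
reduced = [[r[j] for j in kept] for r in rows]
labels = kmeans(reduced, k=3)                     # one pattern label per ego
for ego, label in zip(egos, labels):
    print(ego, label)
```

On this toy graph the edge-count column is dropped because it is nearly collinear with density, and the three resulting clusters separate the star center, the star leaves, and the clique members. A real analysis would use a richer measure set, scale the features, and choose the number of clusters with a validity criterion.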
