Temporal Analysis of Social Networks using Three-way DEDICOM

DEDICOM is an algebraic model for analyzing intrinsically asymmetric relationships, such as the balance of trade among nations or the flow of information among organizations or individuals. It provides information on latent components in the data that can be regarded as "properties" or "aspects" of the objects, and it finds a few patterns that can be combined to describe many relationships among these components. When we apply this technique to the adjacency matrix of a directed graph, we obtain a smaller graph that gives an idealized description of its patterns. Three-way DEDICOM is a higher-order extension of the model that has certain uniqueness properties. It allows for a third mode of the data, such as time, and permits the analysis of semantic graphs. We present an improved algorithm for computing three-way DEDICOM on sparse data and demonstrate it by applying it to the adjacency tensor of a semantic graph with time-labeled edges. Our application uses the Enron email corpus, from which we construct a semantic graph corresponding to email exchanges among Enron personnel over a period of 44 months. Meaningful patterns are recovered in which the representation of asymmetries adds insight into the social networks at Enron.
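As a rough illustration of the objects named above, the sketch below builds an adjacency tensor from time-labeled directed edges and evaluates the three-way DEDICOM model slice by slice. The abstract does not spell out the model equation; the form used here, X_k ≈ A D_k R D_k Aᵀ, is the standard three-way DEDICOM formulation and is an assumption relative to this text. The function names, the tiny example data, and the dense NumPy storage are all hypothetical, and the sketch does not implement the fitting algorithm or the sparse-data improvements the paper describes.

```python
import numpy as np

def build_adjacency_tensor(edges, n_nodes, n_times):
    """Count directed, time-labeled edges into an n x n x m tensor.

    edges: iterable of (sender, receiver, time_index) triples (hypothetical format).
    """
    X = np.zeros((n_nodes, n_nodes, n_times))
    for i, j, t in edges:
        X[i, j, t] += 1.0
    return X

def dedicom_reconstruct(A, R, D):
    """Evaluate X_k = A D_k R D_k A^T for every time slice k.

    A: n x p loadings of objects on latent components.
    R: p x p (generally asymmetric) matrix of relationships among components.
    D: p x m matrix; column k holds the diagonal of D_k (component weights at time k).
    """
    n, p = A.shape
    m = D.shape[1]
    X_hat = np.empty((n, n, m))
    for k in range(m):
        Dk = np.diag(D[:, k])
        X_hat[:, :, k] = A @ Dk @ R @ Dk @ A.T
    return X_hat

# Toy data: 3 people, 2 latent roles, 2 time periods (all values made up).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0]])
R = np.array([[0.0, 2.0],   # role 0 sends to role 1, but not the reverse
              [0.0, 0.0]])
D = np.array([[1.0, 2.0],   # role 0 is twice as active in period 1
              [1.0, 1.0]])

X_hat = dedicom_reconstruct(A, R, D)
X = build_adjacency_tensor([(0, 1, 0), (0, 1, 0), (2, 1, 1)], n_nodes=3, n_times=2)
residual = np.linalg.norm(X - X_hat)
```

Because R is not required to be symmetric, the reconstructed flow from person 0 to person 1 can differ from the flow in the opposite direction, which is the asymmetry the model is designed to represent.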
