Temporal Analysis of Semantic Graphs Using ASALSAN

ASALSAN is a new algorithm for computing three-way DEDICOM, a linear algebra model for analyzing intrinsically asymmetric relationships, such as trade among nations or the exchange of emails among individuals. The three-way model incorporates a third mode of the data, such as time. ASALSAN is unique in that it can compute the three-way DEDICOM model on large, sparse data. We also describe a nonnegative version of ASALSAN. When we apply these techniques to adjacency arrays arising from directed graphs with edges labeled by time, we obtain a smaller graph on latent semantic dimensions and gain additional information about how the relationships among those dimensions change over time. We demonstrate these techniques on international trade data and the Enron email corpus to uncover latent components and their transient behavior. On the Enron data, the mixture of roles assigned to individuals by ASALSAN corresponded strongly with known job classifications and revealed the patterns of communication between these roles. Changes in the communication pattern over time, e.g., between top executives and the legal department, were also apparent in the solutions.
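To make the model concrete, the following is a minimal sketch, not the ASALSAN algorithm itself. It assumes the standard three-way DEDICOM form, in which each time slice X_k of the adjacency array is approximated by A D_k R D_k A^T, with A holding the loadings of nodes on latent roles, R an asymmetric role-to-role matrix, and D_k a diagonal matrix of per-slice role weights. The function names (`adjacency_array`, `dedicom_reconstruct`, `fit_error`) and the dense NumPy storage are illustrative assumptions; the large, sparse data discussed above would call for sparse storage instead.

```python
import numpy as np


def adjacency_array(edges, n_nodes, n_times):
    """Build X with X[i, j, k] = number of edges i -> j in time slice k.

    `edges` is an iterable of (source, target, time_index) triples.
    Dense storage is used only to keep the sketch short.
    """
    X = np.zeros((n_nodes, n_nodes, n_times))
    for src, dst, t in edges:
        X[src, dst, t] += 1.0
    return X


def dedicom_reconstruct(A, R, D):
    """Return the model array whose k-th slice is A @ diag(D[k]) @ R @ diag(D[k]) @ A.T.

    A: (n, p) loadings of n nodes on p latent roles
    R: (p, p) asymmetric interaction matrix between roles
    D: (m, p) role weights per time slice (row k is the diagonal of D_k)
    """
    # Sum over latent indices p, q of A[i,p] * D[k,p] * R[p,q] * D[k,q] * A[j,q].
    return np.einsum("ip,kp,pq,kq,jq->ijk", A, D, R, D, A)


def fit_error(X, A, R, D):
    """Frobenius norm of the residual between the data and the model."""
    return np.linalg.norm(X - dedicom_reconstruct(A, R, D))
```

Given factors A, R, and D produced by some fitting procedure, `fit_error(X, A, R, D)` measures how well the small role-level graph R, rescaled per time slice by D, explains the observed time-labeled edges.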
