Dynamic Shaker Detection from Evolving Entities

Finding the most influential entities and analyzing the causal relationships among them are important problems in economics, healthcare, sensor networks, and other domains. A well-known example in economics is the bankruptcy of Lehman Brothers, which triggered the 2008 global financial crisis. In recent years, several methods have been proposed to infer the causal relationships among entities and then identify the most influential ones (i.e., shakers). However, most previous work assumes that both the causal relationships and the shakers are static, that is, stable over time. This assumption does not necessarily hold, especially for volatile entities or long-term time series. In this paper, we propose a dynamic model called “DShaker” to capture evolving causal relationships and dynamic shakers. The intuition is to model causality propagation as a graph called a “dynamic cascading graph”. We then find the optimal cascading graphs by maximizing their likelihoods in a non-convex multi-objective optimization formulation, which we solve by mapping it to a trace norm minimization problem. We evaluate the method on three social science datasets. The proposed method effectively captures entities whose impact increases over time, whereas existing methods miss most of them. For example, in an experiment on bank statistics from 1998 to 2007 (before the financial crisis), the proposed method successfully identifies Lehman Brothers as one of the most precarious banks with respect to subprime loans.
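
The abstract does not spell out the trace norm formulation, so the following is only a minimal sketch of the standard building block that trace norm minimization solvers commonly rely on: the trace norm (sum of singular values) and its proximal operator, singular value thresholding. It is not the DShaker algorithm itself. The function names `trace_norm` and `singular_value_threshold` and the threshold `tau` are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def trace_norm(W):
    """Trace (nuclear) norm: the sum of the singular values of W."""
    return float(np.linalg.svd(W, compute_uv=False).sum())


def singular_value_threshold(Y, tau):
    """Proximal operator of tau * trace norm at Y.

    Shrinks every singular value of Y by tau and clips at zero,
    which pushes the result toward low rank.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunk singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Usage: a small diagonal matrix whose singular values are 3 and 1.
Y = np.array([[3.0, 0.0], [0.0, 1.0]])
X = singular_value_threshold(Y, tau=1.0)
```

In a proximal gradient scheme, a step of this kind would typically alternate with a gradient step on the smooth part of the objective; how the cascading graph likelihood enters that objective is defined in the paper, not here.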
