Detecting Multiple Stochastic Network Motifs in Network Data

Network motif detection methods are important for studying the structural properties embedded in network data. Extending them to stochastic motifs helps capture the interaction uncertainties in stochastic networks. In this paper, we propose a finite mixture model to detect multiple stochastic motifs in network data, under the assumption that the interactions modeled in the motifs are stochastic in nature. A component-wise Expectation Maximization algorithm is employed so that both the optimal number of motifs and the parameters of their corresponding probabilistic models can be estimated. To evaluate the effectiveness of the algorithm, we applied it to both synthetic and benchmark datasets. We also discuss how the obtained stochastic motifs could help domain experts gain better insights into the over-represented patterns in the network data.
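To make the mixture-model idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each subgraph instance is flattened into a binary vector of edge indicators and that each stochastic motif is a vector of independent edge probabilities. It runs plain EM with a fixed number of components `k`, whereas the component-wise EM described above also estimates the number of motifs. The names `sample_instances` and `fit_bernoulli_mixture` are hypothetical.

```python
import math
import random


def sample_instances(edge_probs, n, rng):
    """Draw n binary edge vectors from one stochastic motif (illustrative helper)."""
    return [[1 if rng.random() < p else 0 for p in edge_probs] for _ in range(n)]


def fit_bernoulli_mixture(samples, k, iters=100, seed=0):
    """Plain EM for a k-component mixture of independent-Bernoulli edge models.

    Assumption: k is fixed by the caller; no model selection is performed here.
    """
    rng = random.Random(seed)
    n, d = len(samples), len(samples[0])
    weights = [1.0 / k] * k
    # Random initial edge probabilities, kept away from 0 and 1.
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    tiny = 1e-9
    for _ in range(iters):
        # E-step: responsibility of each motif for each subgraph instance,
        # computed in log space for numerical stability.
        resp = []
        for x in samples:
            logp = []
            for j in range(k):
                lp = math.log(weights[j] + tiny)
                for xi, t in zip(x, theta[j]):
                    lp += math.log(t if xi else 1.0 - t)
                logp.append(lp)
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([v / total for v in unnorm])
        # M-step: re-estimate mixing weights and per-edge probabilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            theta[j] = []
            for i in range(d):
                num = sum(r[j] * x[i] for r, x in zip(resp, samples))
                # Clamp so that later log() calls stay finite.
                theta[j].append(min(1 - 1e-6, max(1e-6, num / (nj + tiny))))
    return weights, theta


if __name__ == "__main__":
    rng = random.Random(1)
    # Two made-up motifs over the 6 possible directed edges of a 3-node subgraph.
    motif_a = [0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
    motif_b = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]
    data = sample_instances(motif_a, 200, rng) + sample_instances(motif_b, 200, rng)
    weights, theta = fit_bernoulli_mixture(data, k=2)
    for w, t in zip(weights, theta):
        print(round(w, 2), [round(p, 2) for p in t])
```

Each fitted row of `theta` can be read as one stochastic motif: a probability per edge rather than a fixed edge pattern. The synthetic motifs in the usage section are invented for illustration only.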
