Cohesive subgroup model for graph-based text mining

A k-plex is a graph-theoretic generalization of a clique, introduced in social network analysis (SNA) to model tightly knit social subgroups, referred to as cohesive subgroups. The clique model was the earliest mathematical model of a cohesive subgroup, but its overly restrictive definition motivated several relaxations, including the k-plex model. These SNA models are suitable, and potentially more realistic, cluster models for graph-based clustering and data mining. This article discusses the applicability of the k-plex model and its advantages over the clique model. It also describes some recent developments in integer-programming-based approaches for identifying large k-plexes and demonstrates these approaches on a text mining network.
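To make the relaxation concrete, the sketch below uses the standard definition of a k-plex: a vertex set S in which every member is adjacent to at least |S| - k other members, so a 1-plex is exactly a clique. The function names, the adjacency-dictionary representation, and the toy graph are illustrative assumptions, and the brute-force search stands in for, and is not, the integer programming approaches the article describes.

```python
from itertools import combinations


def is_k_plex(adj, subset, k):
    """Return True if `subset` induces a k-plex in the graph `adj`.

    adj maps each vertex to the set of its neighbours (undirected, no loops).
    Each member must have at least |S| - k neighbours inside the subset.
    """
    members = set(subset)
    need = len(members) - k
    return all(len(adj[v] & members) >= need for v in members)


def largest_k_plex(adj, k):
    """Exhaustive search for a maximum k-plex (toy graphs only).

    Hypothetical helper for illustration; it is exponential in the
    number of vertices and is not the article's method.
    """
    vertices = sorted(adj)
    for size in range(len(vertices), 0, -1):
        for cand in combinations(vertices, size):
            if is_k_plex(adj, cand, k):
                return set(cand)
    return set()


# Toy graph (assumed): K4 missing the edge 1-4, plus a pendant vertex 5.
adj = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3, 5},
    5: {4},
}

clique = largest_k_plex(adj, 1)     # k = 1 recovers the clique model
two_plex = largest_k_plex(adj, 2)   # k = 2 tolerates one missing link per member
```

On this toy graph the vertices 1 to 4 fail the clique test only because of the single missing edge 1-4, yet they qualify as a 2-plex, which is the kind of near-complete group the relaxation is meant to capture.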
