Coloring large complex networks

Given a large social or information network, how can we partition the vertices into sets (i.e., colors) such that no two vertices linked by an edge are in the same set, while minimizing the number of sets used? Despite the practical importance of graph coloring, existing work has not systematically investigated or designed methods for large complex networks. In this work, we develop a unified framework for coloring large complex networks, consisting of two main coloring variants that balance the tradeoff between accuracy and efficiency. Building on this framework, we propose coloring methods designed for the scale and structure of complex networks. In particular, the methods leverage triangles, triangle-cores, and other egonet properties, as well as combinations of them. We systematically compare the proposed methods across a wide range of networks (e.g., social, web, and biological networks) and find a significant improvement over previous approaches in nearly all cases. Additionally, the solutions obtained are nearly optimal, and sometimes provably optimal for certain classes of graphs (e.g., collaboration networks). We also propose a parallel algorithm for the problem of coloring neighborhood subgraphs and make several key observations. Overall, the coloring methods are shown to be (1) accurate, with solutions close to optimal, (2) fast and scalable for large networks, and (3) flexible for use in a variety of applications.
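To make the problem statement concrete, the following is a minimal sketch of a greedy coloring that visits vertices in order of decreasing triangle count. It is an illustration only and not the framework proposed in this work: the abstract does not specify the algorithms, so the ordering rule, the tie-breaking by degree, and the names `triangle_counts`, `greedy_coloring`, and `is_proper` are all assumptions made for this example.

```python
from itertools import combinations


def triangle_counts(adj):
    """Count the triangles each vertex participates in.

    `adj` maps every vertex to the set of its neighbors (undirected graph).
    """
    counts = {}
    for v, nbrs in adj.items():
        # A triangle at v is a pair of neighbors of v that are themselves adjacent.
        counts[v] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return counts


def greedy_coloring(adj):
    """Greedy coloring with a triangle-based vertex ordering (assumed heuristic)."""
    tri = triangle_counts(adj)
    # Visit vertices with many triangles first; break ties by degree, then by label.
    order = sorted(adj, key=lambda v: (-tri[v], -len(adj[v]), v))
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # smallest color not used by an already-colored neighbor
            c += 1
        color[v] = c
    return color


def is_proper(adj, color):
    """Check that no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coloring = greedy_coloring(adj)
num_colors = len(set(coloring.values()))
```

On this toy graph the triangle forces at least three colors, so a proper coloring that uses three colors is optimal. In general a greedy ordering gives only an upper bound on the chromatic number, and the largest clique found gives a lower bound.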
