Hypergraph Partitioning for Big Data Applications

Scalability is an important issue in big data management, and minimizing the query cost across multiple hosts in the cloud benefits horizontal scaling. Hypergraphs are a good tool for modeling the data and data relationships of complex networks, which are typical big data applications, and partitioning a hypergraph helps distribute the query load over several hosts. Since balanced hypergraph partitioning is an NP-hard problem, a few heuristic net-cut hypergraph partitioning algorithms have been developed. However, vertex-cut hypergraph partitioning methods can be more effective than net-cut ones. In this paper, we propose a heuristic vertex-cut hypergraph partitioning algorithm, named vcFM, which partitions the hypergraph into balanced sub-hypergraphs as required by moving hyperedges. We show the feasibility of this idea, evaluate our method on the Facebook dataset under a variety of settings, and compare it against two alternative solutions. Experimental results show that vcFM is scalable and achieves a lower cut size than the other two partitioners while keeping the partitions balanced.
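
To make the vertex-cut idea concrete, the following is a minimal Python sketch, not the authors' vcFM implementation. It assumes that each hyperedge is assigned to exactly one part, that a vertex is cut when its hyperedges land in more than one part, and that the cut size is the total number of extra parts each vertex spans. The function names (`vertex_cut_size`, `part_sizes`, `best_single_move`) and the `max_size` balance bound are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def vertex_cut_size(hyperedges, assignment):
    """Sum over all vertices of (number of parts containing the vertex - 1).

    Assumed cost definition: a vertex whose hyperedges all sit in one part
    costs 0; every additional part it appears in costs 1.
    """
    parts_of_vertex = defaultdict(set)
    for edge_id, vertices in hyperedges.items():
        for v in vertices:
            parts_of_vertex[v].add(assignment[edge_id])
    return sum(len(parts) - 1 for parts in parts_of_vertex.values())


def part_sizes(assignment, k):
    """Number of hyperedges currently assigned to each of the k parts."""
    sizes = [0] * k
    for part in assignment.values():
        sizes[part] += 1
    return sizes


def best_single_move(hyperedges, assignment, k, max_size):
    """Brute-force search for the single hyperedge move with the largest gain.

    A move is allowed only if the target part stays within max_size
    hyperedges. Returns (gain, edge_id, target_part), or (0, None, None)
    when no allowed move reduces the cut size.
    """
    base = vertex_cut_size(hyperedges, assignment)
    sizes = part_sizes(assignment, k)
    best = (0, None, None)
    for edge_id, current in assignment.items():
        for target in range(k):
            if target == current or sizes[target] + 1 > max_size:
                continue
            trial = dict(assignment)
            trial[edge_id] = target
            gain = base - vertex_cut_size(hyperedges, trial)
            if gain > best[0]:
                best = (gain, edge_id, target)
    return best


# Toy hypergraph: four hyperedges split over two parts.
hyperedges = {"e1": {1, 2, 3}, "e2": {2, 3, 4}, "e3": {5, 6}, "e4": {6, 7}}
assignment = {"e1": 0, "e2": 1, "e3": 0, "e4": 1}

print(vertex_cut_size(hyperedges, assignment))
print(best_single_move(hyperedges, assignment, k=2, max_size=3))
```

In the toy example, vertices 2, 3 and 6 each span both parts, so a move that brings `e1` and `e2` into the same part removes two of the three cuts. A Fiduccia-Mattheyses style heuristic would repeat such moves with incremental gain updates rather than recomputing the cut size from scratch as this sketch does.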
