In-Memory Big Graph: A Future Research Agenda

With the growing interconnectivity of the world, Big Graph has become a popular emerging technology, driven for instance by social media such as Facebook and Twitter. Prominent examples of Big Graph include social networks, biological networks, graph mining, big knowledge graphs, big web graphs, and scholarly citation networks. A Big Graph consists of millions of nodes and trillions of edges. Big Graphs are growing exponentially and require large computing machinery. Big Graph poses many issues, such as storage, scalability, and processing. This paper gives a brief overview of in-memory Big Graph systems and some key challenges. It also sheds some light on future research agendas for in-memory systems.
