Efficient Scalable Temporal Web Graph Store

Temporal web graphs have attracted much attention recently because of their important applications in web search, data mining, and social network analysis. Accumulated over long periods, these graphs have grown gigantic in size and rich in temporal evolution, which poses serious challenges for data storage and management. Although a few temporal graph management systems have been proposed, none of them simultaneously satisfies the two essential requirements for retrieval on temporal web graphs: scalability to very large data and very low query latency. In this work, we address this gap by developing a highly efficient temporal graph management system dedicated to web graphs. To this end, we substantially extend the most efficient framework for managing large static web graphs so that it handles temporal information using the property matrix, while preserving most of the outstanding features of the base framework. As a result, our proposed system achieves nearly instant response for vertex-centric temporal retrieval while remaining scalable to huge datasets. Experiments on a real-world dataset with more than 43B nodes and 317B links show that, using a small non-dedicated cluster, our system achieves a storage reduction of up to 88% of the raw data size and reduces retrieval time by 20% compared to the baselines. We also demonstrate that our system significantly reduces the computational cost of many graph ranking algorithms.