L-Graph: A General Graph Analytic System on Continuous Computation

Large-scale graph analytics has become an important component of many diverse applications. With the growing scale of real-world graphs, efficient execution of whole-graph analytics has become a challenging problem. Recently, a number of distributed graph processing systems (Pregel, PowerGraph, Trinity) and centralized systems (GraphChi and XStream) have been designed. Compared with the high cost of distributed systems deployed on clusters of commodity machines, centralized systems running on inexpensive PCs are attractive propositions, offering low cost and comparable performance. Through careful analysis, we find that (i) the graph computation abstraction in the centralized systems inherently adopts a batch model similar to that of the distributed systems, which can lead to suboptimal performance, and (ii) the execution model in the centralized systems advocates sequential operations on solid-state disks (SSDs), which are still slower than memory-based operations. To tackle these efficiency issues in centralized systems, we first propose a novel continuous graph computation abstraction. This model continuously processes edges and updates computation results, which allows much faster convergence than the batch model. Second, we propose to maintain vertex states in memory and advocate memory-based operations, which are much faster than sequential operations on SSD. Finally, we design an adaptive memory layout to minimize the overall I/O cost. We develop a proof-of-concept prototype, L-Graph, and implement four example graph analytic applications atop it. Preliminary evaluation on real and synthetic graphs verifies that the proposed continuous model greatly outperforms the widely used batch model and that L-Graph achieves much higher efficiency than the state-of-the-art GraphChi.
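The difference between the batch model and the continuous model can be illustrated with a minimal sketch. It is not L-Graph's actual interface: the function names `sssp_batch` and `sssp_continuous`, the edge-list representation, and the choice of single-source shortest paths as the example application are all assumptions made for illustration. In the batch variant, each pass over the edges reads only the vertex states frozen at the start of that pass. In the continuous variant, each edge reads the latest in-memory vertex state, so an update can propagate further within the same pass.

```python
import math


def sssp_batch(num_vertices, edges, source):
    """Batch model (illustrative): each pass reads a frozen snapshot of vertex states."""
    dist = [math.inf] * num_vertices
    dist[source] = 0.0
    passes = 0
    while True:
        passes += 1
        snapshot = list(dist)  # states as they were when this pass began
        changed = False
        for u, v, w in edges:
            candidate = snapshot[u] + w
            if candidate < dist[v]:
                dist[v] = candidate
                changed = True
        if not changed:
            return dist, passes


def sssp_continuous(num_vertices, edges, source):
    """Continuous model (illustrative): each edge reads the latest in-memory state."""
    dist = [math.inf] * num_vertices
    dist[source] = 0.0
    passes = 0
    while True:
        passes += 1
        changed = False
        for u, v, w in edges:
            candidate = dist[u] + w  # sees updates made earlier in this pass
            if candidate < dist[v]:
                dist[v] = candidate
                changed = True
        if not changed:
            return dist, passes


# Hypothetical input: a directed chain 0 -> 1 -> 2 -> 3 -> 4 with unit weights.
chain_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
batch_dist, batch_passes = sssp_batch(5, chain_edges, source=0)
cont_dist, cont_passes = sssp_continuous(5, chain_edges, source=0)
# Both variants reach the same distances: [0.0, 1.0, 2.0, 3.0, 4.0]
# batch_passes == 5, cont_passes == 2 (the last pass only confirms convergence)
```

On this chain the batch variant advances the frontier by one vertex per pass, while the continuous variant settles every vertex in its first pass. The gap depends on the graph and on the order in which edges are processed, so the example shows the mechanism only and says nothing about the measured speedups reported for L-Graph.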