Paper Information - Efficient CUDA Algorithms for the Maximum Network Flow Problem

Publisher Summary: This chapter presents graphics processing unit (GPU) algorithms for the maximum network flow problem. Maximum network flow is a fundamental graph theory problem with applications in many areas. Compared with the data-parallel problems that have been deployed onto GPUs, the maximum network flow problem is more challenging for GPUs owing to its intensive data and control dependencies. Two GPU-based maximum flow algorithms are presented in this chapter: the first is asynchronous and lock-free, whereas the second is synchronized through a precoloring technique. The first algorithm solves the maximum flow problem by using atomic operations to perform the push and relabel operations asynchronously. The second algorithm works on precolored graphs and avoids race conditions through barriers. Experiments using the NVIDIA C1060 GPU show that, despite the intrinsic challenges of data dependencies and divergent execution paths, both algorithms achieve speed-ups of at least 3 times, and up to 8 times, over implementations on a quad-core Intel Xeon CPU. The chapter also demonstrates that, through algorithm design and implementation, GPUs are capable of accelerating intrinsically data-dependent problems.
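For readers unfamiliar with the push and relabel operations named in the summary, the following is a minimal sequential sketch in Python. It is not the chapter's GPU code: it uses no atomics, barriers, or coloring, and the function name `max_flow_push_relabel`, the edge-list input format, and the stack of active vertices are all assumptions made for illustration. It only shows what a single push and a single relabel do to the excess, residual capacities, and heights that a parallel version would have to update safely.

```python
from collections import defaultdict


def max_flow_push_relabel(n, edges, source, sink):
    """Sequential generic push-relabel (illustrative sketch, not the GPU version).

    n      -- number of vertices, labelled 0..n-1
    edges  -- iterable of (u, v, capacity) directed arcs
    Returns the value of a maximum flow from source to sink.
    """
    # cap[u][v] is the residual capacity of arc u -> v.
    cap = [defaultdict(int) for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += 0  # make sure the reverse residual arc exists

    height = [0] * n
    excess = [0] * n
    height[source] = n

    # Initial preflow: saturate every arc leaving the source.
    for v in list(cap[source]):
        c = cap[source][v]
        if c > 0:
            cap[source][v] = 0
            cap[v][source] += c
            excess[v] += c

    # Active vertices: positive excess, excluding source and sink.
    active = [v for v in range(n) if v not in (source, sink) and excess[v] > 0]

    while active:
        u = active.pop()
        while excess[u] > 0:
            for v in list(cap[u]):
                if excess[u] == 0:
                    break
                # Push: allowed only "downhill" by exactly one height level.
                if cap[u][v] > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], cap[u][v])
                    cap[u][v] -= delta
                    cap[v][u] += delta
                    excess[u] -= delta
                    if excess[v] == 0 and v not in (source, sink):
                        active.append(v)  # v becomes active
                    excess[v] += delta
            if excess[u] > 0:
                # Relabel: no admissible arc is left, so lift u just above
                # its lowest neighbour reachable in the residual graph.
                height[u] = 1 + min(height[v] for v in cap[u] if cap[u][v] > 0)

    return excess[sink]


# Small example graph: vertex 0 is the source, vertex 5 is the sink.
example_edges = [
    (0, 1, 16), (0, 2, 13), (1, 3, 12), (2, 1, 4), (2, 4, 14),
    (3, 2, 9), (3, 5, 20), (4, 3, 7), (4, 5, 4),
]
result = max_flow_push_relabel(6, example_edges, 0, 5)
```

On this example graph the minimum cut has capacity 23, so `result` is 23. In a parallel setting, the updates to `excess` and `cap` inside the push step are the ones that conflict between neighbouring vertices, which is the kind of dependency the summary says the two algorithms handle with atomic operations or with precoloring and barriers.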

Zhengyu He | Bo Hong | Jiadong Wu

[1] Richard J. Anderson, et al. On the parallel implementation of Goldberg's maximum flow algorithm, 1992, SPAA '92.

[2] Andrew V. Goldberg, et al. Recent Developments in Maximum Flow Algorithms (Invited Lecture), 1998, SWAT.

[3] Zhengyu He, et al. An Asynchronous Multithreaded Algorithm for the Maximum Network Flow Problem with Nonblocking Global Relabeling Heuristic, 2011, IEEE Transactions on Parallel and Distributed Systems.

[4] E. A. Dinic. Algorithm for solution of a problem of maximal flow in a network with power estimation, 1970.

[5] David A. Bader, et al. A Cache-Aware Parallel Implementation of the Push-Relabel Network Flow Algorithm and Experimental Evaluation of the Gap Relabeling Heuristic, 2006, PDCS.

[6] Wu-chun Feng, et al. Inter-block GPU communication via fast barrier synchronization, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[7] Richard M. Karp, et al. Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems, 1972, Combinatorial Optimization.