Accelerating Network Coding on Many-core GPUs and Multi-core CPUs

Network coding has been widely applied in distributed systems to improve throughput, resilience to network dynamics, or both. However, the computational overhead of network coding operations is not negligible and has become an obstacle to practical deployment. In this paper, we exploit the computing power of commodity many-core Graphics Processing Units (GPUs) and multi-core CPUs to accelerate network coding operations. We propose a set of parallel algorithms that maximize the parallelism of the encoding and decoding processes and fully utilize the power of GPUs. We also share our optimization design choices and our workarounds for the challenges encountered in working with GPUs. With our implementation of these algorithms, we achieve significant speedup over existing CPU-based solutions.
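To make the source of the computational overhead concrete, the following is a minimal sketch of the encoding step, not the paper's implementation. It assumes random linear network coding over GF(2^8) with the reduction polynomial 0x11B; the abstract does not specify the field or the coding scheme, and the names `gf256_mul` and `encode_block` are hypothetical.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8) by shift-and-add.

    The reduction polynomial 0x11B is an assumption made for this sketch.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce back into 8 bits
        b >>= 1
    return result


def encode_block(coeffs, blocks):
    """Return one coded block: a byte-wise linear combination of the
    equally sized source blocks, weighted by the coefficients."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for i, byte in enumerate(block):
            # Each output byte depends only on byte i of every source block.
            out[i] ^= gf256_mul(c, byte)
    return bytes(out)
```

Every output byte is computed independently of the others, which is the kind of data parallelism that many-core hardware can exploit. Decoding, under the same assumption, amounts to solving the corresponding linear system over the field and is harder to parallelize.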
