A 122Mb/s Turbo decoder using a mid-range GPU

Parallel implementations of Turbo decoding have been studied extensively. Traditionally, the number of parallel sub-decoders is limited in order to bound the loss in code block error rate performance caused by the edge effect of code block division. In addition, the sub-decoders must synchronize to exchange information during the iterative process. In this paper, we propose loosening the synchronization between the sub-decoders to achieve higher utilization of parallel processor resources. Our method allows a high degree of parallel processor utilization when decoding a single code block, providing a scalable software-based implementation. The proposed implementation is demonstrated on a graphics processing unit. We achieve a decoding throughput of 122.8 Mbps using a mid-range GPU, the Nvidia GTX480. This is, to the best of our knowledge, the fastest Turbo decoding throughput achieved with a GPU-based implementation.
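To illustrate what code block division means and where the edge effect arises, the following is a minimal sketch of splitting one code block into equal sub-blocks, one per sub-decoder. It is not taken from the paper: the names `SubBlock` and `split_code_block` and the `guard` parameter are hypothetical, and the guard (warm-up) window is a commonly used way to soften edge effects, assumed here only for illustration.

```python
from typing import List, NamedTuple


class SubBlock(NamedTuple):
    """Index ranges for one sub-decoder (hypothetical helper type)."""
    start: int       # first trellis step owned by this sub-decoder
    end: int         # one past the last owned trellis step
    warm_start: int  # where a forward warm-up recursion could begin
    warm_end: int    # where a backward warm-up recursion could begin


def split_code_block(k: int, num_sub: int, guard: int) -> List[SubBlock]:
    """Divide a code block of k trellis steps into num_sub equal sub-blocks.

    Each sub-block is extended by up to `guard` steps on both sides
    (clipped to the code block). The guard window is an assumption of
    this sketch, not a detail stated in the abstract.
    """
    if k <= 0 or num_sub <= 0 or guard < 0:
        raise ValueError("k and num_sub must be positive, guard non-negative")
    if k % num_sub != 0:
        raise ValueError("this sketch assumes k is divisible by num_sub")

    size = k // num_sub
    blocks = []
    for i in range(num_sub):
        start = i * size
        end = start + size
        blocks.append(
            SubBlock(start, end, max(0, start - guard), min(k, end + guard))
        )
    return blocks


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for block in split_code_block(k=6144, num_sub=4, guard=32):
        print(block)
```

In this sketch, increasing `num_sub` shortens each sub-block, so a larger share of its trellis steps lies near a boundary where the recursion starts from unreliable state metrics. That is the trade-off between parallelism and error rate performance that the abstract refers to.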
