Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications, and the Open Computing Language (OpenCL) was introduced to exploit the computational power of such heterogeneous systems. This article presents a parallel software decoder for low-density parity-check (LDPC) codes on an embedded heterogeneous platform using the OpenCL framework. LDPC codes are among the most widely used and most powerful error-correcting codes in mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed decoder, steps suited to task-level parallelization run on the multi-core central processing unit (CPU), and steps suited to data-level parallelization run on the graphics processing unit (GPU). To improve the performance of the OpenCL kernels for LDPC decoding, explicit thread scheduling, vectorization, and efficient data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors under a unified computing framework.
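As an illustration of what a data-parallel decoding step can look like, the sketch below shows a check-node update in which every check node is processed independently of the others. The abstract does not say which decoding algorithm is used or which steps are mapped to the GPU, so the min-sum rule and the function names `check_node_update` and `check_node_pass` are assumptions made for this example only. It is written in plain Python rather than as an OpenCL kernel.

```python
def check_node_update(llrs):
    """Min-sum update for one check node (assumed algorithm, not from the article).

    llrs: incoming variable-to-check messages as log-likelihood ratios.
          The check node must have at least two edges.
    Returns the outgoing check-to-variable messages, one per edge.
    """
    signs = [-1.0 if v < 0 else 1.0 for v in llrs]
    total_sign = 1.0
    for s in signs:
        total_sign *= s

    mags = [abs(v) for v in llrs]
    # The smallest and second-smallest magnitudes are enough to give
    # the "minimum over all other edges" for every edge.
    order = sorted(range(len(mags)), key=mags.__getitem__)
    min1_idx = order[0]
    min1 = mags[order[0]]
    min2 = mags[order[1]]

    out = []
    for i, s in enumerate(signs):
        mag = min2 if i == min1_idx else min1
        # total_sign * s removes this edge's own sign from the product.
        out.append(total_sign * s * mag)
    return out


def check_node_pass(messages_per_check):
    """Update all check nodes.

    Each list element is handled independently, which is the property
    that would let one GPU work-item process one check node.
    """
    return [check_node_update(msgs) for msgs in messages_per_check]


if __name__ == "__main__":
    print(check_node_pass([[1.5, -0.5, 2.0], [-3.0, -1.0]]))
```

Because no iteration of the loop in `check_node_pass` depends on another, the same computation could in principle be expressed as a kernel launched over all check nodes, with the per-edge arithmetic as a candidate for vectorization.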
