Wideband Channelization for Software-Defined Radio via Mobile Graphics Processors

Wideband channelization is a computationally intensive task within software-defined radio (SDR). To support this task, the underlying hardware should provide high performance and allow flexible implementations. Traditional solutions use field-programmable gate arrays (FPGAs) to satisfy these requirements. While FPGAs allow for flexible implementations, realizing an FPGA implementation is a difficult and time-consuming process. Multicore processors, on the other hand, are more programmable but fail to satisfy the performance requirements. Graphics processing units (GPUs) overcome both limitations. However, traditional GPUs are power-hungry and can consume as much as 350 watts, making them ill-suited for many SDR environments, particularly those that are battery-powered. Here we explore the viability of low-power mobile graphics processors to simultaneously overcome the limitations of performance, flexibility, and power. Via execution profiling and performance analysis, we identify major bottlenecks in mapping the wideband channelization algorithm onto these devices and adopt several optimization techniques to achieve a multiplicative speedup over a multithreaded implementation. Overall, our approach delivers a speedup of up to 43-fold on the discrete AMD Radeon HD 6470M GPU and 27-fold on the integrated AMD Radeon HD 6480G GPU, when compared to a vectorized and multithreaded version running on the AMD A4-3300M CPU.
