An application specific instruction set processor based implementation for signal detection in multiple antenna systems

In comparison to single-antenna systems, a wireless multiple-input multiple-output (MIMO) system provides higher throughput at no additional cost in bandwidth, but the high complexity of the detection algorithms poses a major challenge to hardware implementation. Maximum likelihood (ML) MIMO detection guarantees optimal performance but implies huge processing complexity, which makes this approach acceptable only when the number of transmitting antennas is low and the adopted modulation scheme has a small cardinality. Sphere decoding (SD) is an efficient method that significantly reduces the average processing complexity with no performance penalty. Most known sphere decoders have been implemented as application specific integrated circuits (ASICs), but the need for a high degree of flexibility in MIMO detection also makes application specific instruction set processor (ASIP) implementations attractive. A single programmable ASIP can hardly reach the same processing speed as a fully dedicated ASIC; thus, parallel architectures with multiple concurrent ASIPs must be conceived to guarantee sufficient data throughput. The objective of this paper is to present a new ASIP-based implementation for the detection of MIMO signals. The processor supports multiple lattice modulation schemes (up to 64-QAM) and up to four transmitting antennas, and it is able to run both ML and close-to-ML algorithms. A parallel architecture has also been designed, comprising multiple ASIPs that concurrently execute the detection algorithm on received symbols, a central unit acting as task scheduler, and a buffer that compensates for the non-constant throughput. A dedicated bus handles the communication among the allocated units. Each ASIP occupies a silicon area of 0.093 mm^2 and runs at 400 MHz when implemented in a 90 nm CMOS technology. The achievable throughput depends on the adopted MIMO system and on the number of allocated ASIPs: a detector with 10 ASIPs programmed to run ML detection on a 4x4 MIMO system with 64-QAM modulation offers a throughput of 78 Mbps at a signal-to-noise ratio (SNR) of 18 dB.
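
To illustrate why SD reaches the ML solution at lower average cost, the following minimal Python sketch compares an exhaustive ML search with a plain depth-first sphere decoder. It is not the algorithm or instruction set of the ASIP presented in the paper. It assumes a real-valued model y = R s + n in which R is already upper triangular (for example after a QR decomposition of the channel matrix); the function names, the 3x3 matrix, the 4-level alphabet, and the noise values are all hypothetical and chosen only for illustration.

```python
import itertools


def ml_exhaustive(R, y, alphabet):
    """Brute-force ML: evaluate the distance of every candidate vector."""
    n = len(y)
    best, best_d = None, float("inf")
    for s in itertools.product(alphabet, repeat=n):
        d = sum(
            (y[i] - sum(R[i][j] * s[j] for j in range(i, n))) ** 2
            for i in range(n)
        )
        if d < best_d:
            best, best_d = s, d
    return best, best_d


def sphere_decode(R, y, alphabet):
    """Depth-first sphere decoder on an upper-triangular system.

    The search starts from the last row of R (one unknown) and moves
    upwards. A branch is abandoned as soon as its partial distance is
    not smaller than the best full distance found so far, so the
    radius shrinks every time a better leaf is reached.
    """
    n = len(y)
    state = {"s": None, "d": float("inf"), "nodes": 0}
    s = [0] * n

    def search(level, partial):
        for sym in alphabet:
            s[level] = sym
            state["nodes"] += 1
            e = y[level] - sum(R[level][j] * s[j] for j in range(level, n))
            d = partial + e * e
            if d >= state["d"]:
                continue  # outside the current sphere: prune this branch
            if level == 0:
                state["d"], state["s"] = d, tuple(s)  # new best leaf
            else:
                search(level - 1, d)

    search(n - 1, 0.0)
    return state["s"], state["d"], state["nodes"]


# Hypothetical example: 3 real dimensions, 4-level alphabet.
R = [[2.0, 0.5, 0.1],
     [0.0, 1.5, 0.3],
     [0.0, 0.0, 1.0]]
alphabet = (-3, -1, 1, 3)
s_true = (1, -3, 3)
noise = (0.05, -0.10, 0.08)
y = [sum(R[i][j] * s_true[j] for j in range(3)) + noise[i] for i in range(3)]

s_ml, d_ml = ml_exhaustive(R, y, alphabet)
s_sd, d_sd, nodes = sphere_decode(R, y, alphabet)
print(s_ml, s_sd, nodes)
```

Both functions return the same vector because pruning only discards branches that cannot improve on the best candidate already found. The exhaustive search always evaluates every candidate (for the 4x4, 64-QAM case in the abstract that is 64^4 = 16,777,216 complex-valued vectors), whereas the number of nodes visited by the sphere decoder varies with the channel and the noise, which is the source of the non-constant throughput mentioned above.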
