In traditional link-level simulation, the multiple-input multiple-output (MIMO) channel model is one of the most time-consuming modules, and more realistic geometry-based channel models consume even more time. In this paper, we propose an efficient simulator implementation of the geometry-based spatial channel model (SCM) on a graphics processing unit (GPU). We first analyze the potential parallelism of the SCM module, whose simulation consists of generating channel coefficients, generating additive white Gaussian noise (AWGN), filtering the input signals, and adding the noise. We then implement all of the parallelizable sub-modules on the GPU using the Open Computing Language (OpenCL). Next, we apply several GPU acceleration techniques to optimize these GPU functions: an out-of-order command queue, data merging, local memory sharing, and vectorization. Finally, we verify our approach on Nvidia's mid-range GTX660 GPU. The experimental results show that the proposed GPU implementation achieves a speedup of more than 1000 times over an implementation on a traditional central processing unit (CPU). The simulation time is close to the processing time of the transmitter and receiver, which makes it possible to construct a real-time link-level channel simulator for long term evolution (LTE) or LTE-Advanced systems and for software-defined radio. As far as we know, we are the first to accelerate the SCM on a GPU. These results should have significant practical value.
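The filtering and noise-adding stages named above can be illustrated with a minimal CPU-side sketch in Python. It is not the paper's OpenCL implementation. It assumes time-invariant channel taps (SCM coefficients vary over time), a tapped-delay-line filter, and the Box-Muller transform for Gaussian samples; the names `box_muller` and `apply_channel` are hypothetical.

```python
import math
import random


def box_muller(rng):
    """Return one complex Gaussian sample with unit average power."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    # Real and imaginary parts are independent N(0, 1) values;
    # dividing by sqrt(2) gives each part variance 1/2.
    re = radius * math.cos(2.0 * math.pi * u2)
    im = radius * math.sin(2.0 * math.pi * u2)
    return complex(re, im) / math.sqrt(2.0)


def apply_channel(h, x, noise_var, rng):
    """Filter x through the MIMO taps h and add complex AWGN.

    h: channel taps indexed [rx antenna][tx antenna][delay tap]
    x: transmit samples indexed [time][tx antenna]
    Returns y indexed [time][rx antenna].
    """
    n_rx, n_tx, n_taps = len(h), len(h[0]), len(h[0][0])
    sigma = math.sqrt(noise_var)
    y = []
    for t in range(len(x)):
        row = []
        for u in range(n_rx):
            # Each (t, u) output depends only on the inputs, so on a GPU
            # every such output could be computed by its own work-item.
            acc = 0j
            for s in range(n_tx):
                for l in range(n_taps):
                    if t - l >= 0:
                        acc += h[u][s][l] * x[t - l][s]
            row.append(acc + sigma * box_muller(rng))
        y.append(row)
    return y
```

In this sketch the two outer loops over time samples and receive antennas carry no dependencies between iterations, which is the kind of data parallelism a GPU kernel can exploit.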