On Recurrent Neural Networks for Sequence-based Processing in Communications

In this work, we analyze the capabilities and practical limitations of neural networks (NNs) for sequence-based signal processing, an omnipresent task in virtually every modern communication system. In particular, we train multiple state-of-the-art recurrent neural network (RNN) structures to decode convolutional codes, which enables a clear benchmark against the corresponding maximum likelihood (ML) Viterbi decoder. We examine the decoding performance of various NN architectures, ranging from classical types such as feedforward and gated recurrent unit (GRU) layers to more recently introduced architectures such as temporal convolutional networks (TCNs) and differentiable neural computers (DNCs) with external memory. A key limitation turns out to be that the training complexity increases exponentially with the memory ν of the convolutional encoder and, thus, practically limits the achievable bit error rate (BER) performance. To overcome this limitation, we introduce a new training method that gradually increases the number of ones within the training sequences, i.e., we constrain the set of possible training sequences at the beginning until first convergence. By consecutively adding more and more possible sequences to the training set, we finally achieve training success in cases where naive training did not converge. Further, we show that our network can learn to jointly detect and decode a quadrature phase shift keying (QPSK) modulated code with sub-optimal (anti-Gray) labeling in one shot, at a performance that classic detection schemes only reach via iterations between demapper and decoder.
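To make the proposed curriculum concrete, the following is a minimal sketch of how such weight-constrained training sequences could be generated, with the constraint relaxed stage by stage. All names and parameters (random_weight_limited_bits, curriculum_stages, k, w_max, the number of stages) are illustrative assumptions, not taken from the paper; the encoding, channel, and RNN training steps are left as placeholders.

```python
import numpy as np

def random_weight_limited_bits(batch_size, k, w_max, rng):
    """Draw length-k information sequences with at most w_max ones each."""
    seqs = np.zeros((batch_size, k), dtype=np.int8)
    for i in range(batch_size):
        w = rng.integers(0, w_max + 1)            # Hamming weight of this sequence
        pos = rng.choice(k, size=w, replace=False)  # positions of the ones
        seqs[i, pos] = 1
    return seqs

def curriculum_stages(k, n_stages):
    """Relax the weight constraint from very sparse up to unconstrained (w_max = k)."""
    return np.linspace(1, k, n_stages, dtype=int)

rng = np.random.default_rng(0)
k = 100                                           # information bits per sequence (assumed)
for w_max in curriculum_stages(k, n_stages=5):
    batch = random_weight_limited_bits(256, k, int(w_max), rng)
    # ... encode each sequence with the convolutional code, modulate, add channel
    # noise, and train the RNN decoder on this batch until first convergence,
    # then move on to the next (less constrained) curriculum stage.
```

The intuition behind this staging is that sparse sequences excite only few, well-separated trellis paths, giving the network an easier initial target before the full sequence space is opened up.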
