MA-LSTM: A Multi-Attention Based LSTM for Complex Pattern Extraction

With growth in data volume, computing power, and algorithms, deep learning has developed rapidly and shows excellent performance. Recently, many deep learning models have been proposed to solve problems in different areas. A recurrent neural network (RNN) is a class of artificial neural networks in which connections between nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior, which makes it applicable to tasks such as handwriting recognition and speech recognition. However, RNNs rely heavily on automatic parameter learning that concentrates on the data flow, and seldom consider the feature extraction capability of the gate mechanism. In this paper, we propose a novel architecture in which the forget gate is generated from multiple bases. Instead of the traditional single-layer fully-connected network, we use a Multiple Attention (MA) based network to generate the forget gate, which refines the optimization space of the gate function and improves the granularity with which the recurrent neural network approximates the ground-truth mapping. Owing to the benefit of the MA structure for the gate mechanism, the proposed MA-LSTM model achieves better feature extraction capability than other known models.
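To make the contrast between the two gate designs concrete, the following is a minimal NumPy sketch. It is not the exact MA-LSTM formulation, which the abstract does not specify. It assumes one plausible reading: several sigmoid "base" gates are computed in parallel and mixed by softmax attention weights. The function names, the parameter names (`W_bases`, `b_bases`, `A`), and the number of bases `K` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Subtract the maximum for numerical stability.
    e = np.exp(z - z.max())
    return e / e.sum()


def standard_forget_gate(h_prev, x_t, W, b):
    """Traditional forget gate: one fully-connected layer plus a sigmoid."""
    hx = np.concatenate([h_prev, x_t])      # shape (H + D,)
    return sigmoid(W @ hx + b)              # shape (H,)


def multi_base_forget_gate(h_prev, x_t, W_bases, b_bases, A):
    """Hypothetical multi-base gate (an assumption, not the paper's equations).

    W_bases: (K, H, H + D) weights of K base gates
    b_bases: (K, H)        biases of the K base gates
    A:       (K, H + D)    attention scoring weights, one row per base
    """
    hx = np.concatenate([h_prev, x_t])      # shape (H + D,)
    bases = sigmoid(W_bases @ hx + b_bases)  # (K, H): one candidate gate per base
    alpha = softmax(A @ hx)                  # (K,): attention weights over bases
    return alpha @ bases                     # (H,): convex mix, stays in (0, 1)
```

Under these assumptions the mixed gate is a convex combination of the base gates, so every entry still lies strictly between 0 and 1, and with a single base (`K = 1`) it reduces to the traditional forget gate.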

