Attention and Memory-Augmented Networks for Dual-View Sequential Learning

In recent years, sequential learning has attracted great interest owing to advances in deep learning, with applications in time-series forecasting, natural language processing, and speech recognition. Recurrent neural networks (RNNs) have achieved superior performance in single-view and synchronous multi-view sequential learning compared to traditional machine learning models. However, they remain less explored in asynchronous multi-view sequential learning, where the unaligned nature of the sequences makes it difficult to learn inter-view interactions. We develop AMANet (Attention and Memory-Augmented Networks), an architecture that integrates both attention and memory to solve the asynchronous multi-view sequential learning problem in general; in this paper we focus on experiments with dual-view sequences. Self-attention and inter-attention are employed to capture intra-view and inter-view interactions, respectively. A history attention memory is designed to store the historical information of a specific object, serving as local knowledge storage, while a dynamic external memory stores global knowledge for each view. We evaluate our model on three tasks: medication recommendation from a patient's medical records, diagnosis-related group (DRG) classification from a hospital record, and invoice fraud detection from a company's taxation behaviors. The results demonstrate that our model outperforms all baselines and other state-of-the-art models on all tasks. Moreover, an ablation study indicates that the inter-attention mechanism plays a key role: it boosts predictive power by effectively capturing inter-view interactions from asynchronous views.
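To make the inter-attention idea concrete, below is a minimal sketch (not the authors' implementation) of how one view's sequence can attend over another, unaligned view's sequence via scaled dot-product attention, so no step-wise alignment between the two views is required. All shapes, names, and the toy data are illustrative assumptions.

```python
# Minimal inter-attention sketch: view A queries view B.
# Hypothetical names/shapes; the paper's actual layers may differ.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inter_attention(view_a, view_b, d_k):
    """Let every step of view_a attend over all steps of view_b.

    view_a: (len_a, d_model) query sequence (e.g., diagnosis codes per visit)
    view_b: (len_b, d_model) key/value sequence (e.g., procedure codes)
    Returns a (len_a, d_model) representation of view_a enriched with
    inter-view context; len_a and len_b may differ (asynchronous views).
    """
    scores = view_a @ view_b.T / np.sqrt(d_k)   # (len_a, len_b) affinities
    weights = softmax(scores, axis=-1)          # attention over view_b steps
    return weights @ view_b                     # context-enriched view_a

# Toy usage: two unaligned sequences of different lengths.
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 16))   # view 1: 5 steps
b = rng.standard_normal((8, 16))   # view 2: 8 steps
fused = inter_attention(a, b, d_k=16)
print(fused.shape)                 # (5, 16)
```

Because attention weights are computed over all steps of the other view, this formulation sidesteps the alignment problem that step-synchronized fusion methods face; self-attention for the intra-view case is the same computation with a single view serving as query, key, and value.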
