Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
