Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang
[1] Andreas Spanias, et al. Attend and Diagnose: Clinical Time Series Analysis using Attention Models, 2017, AAAI.
[2] J. Vargas-Guzmán, et al. Change of Support of Transformations: Conservation of Lognormality Revisited, 2005.
[3] Arman Cohan, et al. Longformer: The Long-Document Transformer, 2020, ArXiv.
[4] K. Torkkola, et al. A Multi-Horizon Quantile Recurrent Forecaster, 2017, ArXiv.
[5] M. Romeo, et al. Broad distribution effects in sums of lognormal random variables, 2002, ArXiv.
[6] Sepp Hochreiter, et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2015, ICLR.
[7] Peng Xu, et al. Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer, 2020, AISTATS.
[8] Yisong Yue, et al. Long-term Forecasting using Higher Order Tensor RNNs, 2017.
[9] Norman C. Beaulieu. An Extended Limit Theorem for Correlated Lognormal Sums, 2012, IEEE Transactions on Communications.
[10] Sebastian Ewert, et al. Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling, 2019, IJCAI.
[11] C. Lo. The Sum and Difference of Two Lognormal Random Variables, 2013.
[12] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[13] Ling Yang, et al. DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series prediction, 2019, Expert Syst. Appl.
[14] Makoto Yamada, et al. Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel, 2019, EMNLP/IJCNLP.
[15] Valentin Flunkert, et al. DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks, 2017, International Journal of Forecasting.
[16] Ridha Bouallegue, et al. On the Approximation of the Sum of Lognormals by a Log Skew Normal Distribution, 2015.
[17] Gwilym M. Jenkins, et al. Time series analysis, forecasting and control, 1971.
[18] Omer Levy, et al. Blockwise Self-Attention for Long Document Understanding, 2020, EMNLP.
[19] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[20] Vladlen Koltun, et al. Convolutional Sequence Modeling Revisited, 2018, ICLR.
[21] D. Dufresne. Sums of Lognormals, 2008.
[22] Jacob Weiner, et al. The meaning and measurement of size hierarchies in plant populations, 1984, Oecologia.
[23] Matthias W. Seeger, et al. Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale, 2017, ArXiv.
[24] Aderemi Oluyinka Adewumi, et al. Stock Price Prediction Using the ARIMA Model, 2014, UKSim-AMSS 16th International Conference on Computer Modelling and Simulation.
[25] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[26] Quoc V. Le, et al. Learning Longer-term Dependencies in RNNs with Auxiliary Losses, 2018, ICML.
[27] Sunita Sarawagi, et al. ARMDN: Associative and Recurrent Mixture Density Networks for eRetail Demand Forecasting, 2018, ArXiv.
[28] Timothy P. Lillicrap, et al. Compressive Transformers for Long-Range Sequence Modelling, 2019, ICLR.
[29] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[30] Emily B. Fox, et al. Adaptively Truncating Backpropagation Through Time to Control Gradient Bias, 2019, UAI.
[31] Richard A. Davis. Time Series: Theory and Methods, 2013.
[32] Christos Faloutsos, et al. FUNNEL: automatic mining of spatially coevolving epidemics, 2014, KDD.
[33] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[34] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[35] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[36] Shih-Fu Chang, et al. CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation, 2019, ArXiv.
[37] Guokun Lai, et al. Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks, 2017, SIGIR.
[38] Kuldip K. Paliwal, et al. Bidirectional recurrent neural networks, 1997, IEEE Trans. Signal Process.
[39] Yisong Yue, et al. Long-term Forecasting using Tensor-Train RNNs, 2017, ArXiv.
[40] P. Young, et al. Time series analysis, forecasting and control, 1972, IEEE Transactions on Automatic Control.
[41] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[42] Yoshua Bengio, et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches, 2014, SSST@EMNLP.
[43] Matthias W. Seeger, et al. Bayesian Intermittent Demand Forecasting for Large Inventories, 2016, NIPS.
[44] Thomas A. Funkhouser, et al. Dilated Residual Networks, 2017, CVPR.
[45] Lukasz Kaiser, et al. Reformer: The Efficient Transformer, 2020, ICLR.
[46] Jürgen Schmidhuber, et al. Recurrent Highway Networks, 2016, ICML.
[47] Shou-De Lin, et al. A Memory-Network Based Solution for Multivariate Time-Series Forecasting, 2018, ArXiv.
[48] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[49] Han Fang, et al. Linformer: Self-Attention with Linear Complexity, 2020, ArXiv.
[50] Philip S. Yu, et al. Optimal multi-scale patterns in time series streams, 2006, SIGMOD.
[51] Giuseppe Carlo Calafiore, et al. Log-Sum-Exp Neural Networks and Posynomial Models for Convex and Log-Log-Convex Data, 2018, IEEE Transactions on Neural Networks and Learning Systems.
[52] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[53] Dennis Shasha, et al. StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time, 2002, VLDB.
[54] Benjamin Letham, et al. Forecasting at Scale, 2018, PeerJ Preprints.
[55] Alexander M. Rush, et al. Dilated Convolutions for Modeling Long-Distance Genomic Dependencies, 2017, bioRxiv.
[56] Garrison W. Cottrell, et al. A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction, 2017, IJCAI.
[57] Cyrus Shahabi, et al. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting, 2017, ICLR.
[58] Wenhu Chen, et al. Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting, 2019, NeurIPS.