Beyond Expectation: Deep Joint Mean and Quantile Regression for Spatiotemporal Problems.

Spatiotemporal problems are ubiquitous and of vital importance in many research fields. Despite the potential already demonstrated by deep learning methods in modeling spatiotemporal data, typical approaches focus solely on the conditional expectations of the output variables being modeled. In this article, we propose a multioutput multiquantile deep learning approach that jointly models several conditional quantiles together with the conditional expectation, providing a more complete “picture” of the predictive density in spatiotemporal problems. Using two large-scale data sets from the transportation domain, we empirically demonstrate that, by approaching the quantile regression problem from a multitask learning perspective, it is possible to solve the embarrassing quantile crossing problem while significantly outperforming state-of-the-art quantile regression methods. Moreover, we show that jointly modeling the mean and several conditional quantiles not only provides a rich description of the predictive density that can capture heteroscedastic properties at negligible computational overhead, but also improves predictions of the conditional expectation, owing to the extra information and the regularization effect induced by the added quantiles.
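The abstract does not spell out the training objective, so the following is only a minimal sketch of what a joint mean-and-quantile loss could look like. It assumes the standard pinball (tilted absolute) loss for each quantile level and a squared-error term for the mean; the function names, the `weight` parameter, and the unweighted sum over quantiles are illustrative assumptions, not the article's formulation. A small helper counts quantile crossings, that is, samples whose predicted quantiles are not ordered by quantile level.

```python
import numpy as np


def pinball_loss(y, q_pred, tau):
    """Mean pinball loss of predictions q_pred for quantile level tau in (0, 1)."""
    diff = y - q_pred
    # Under-prediction is penalised by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def joint_loss(y, mean_pred, quantile_preds, taus, weight=1.0):
    """Squared error on the mean output plus pinball losses on the quantile outputs.

    quantile_preds has shape (n_samples, len(taus)); column k corresponds to taus[k].
    `weight` is a hypothetical knob balancing the two terms (an assumption).
    """
    mse = float(np.mean((y - mean_pred) ** 2))
    quantile_term = sum(
        pinball_loss(y, quantile_preds[:, k], tau) for k, tau in enumerate(taus)
    )
    return mse + weight * quantile_term


def count_crossings(quantile_preds):
    """Number of samples whose predicted quantiles decrease somewhere.

    Columns are assumed to be ordered by increasing quantile level.
    """
    decreasing = np.diff(quantile_preds, axis=1) < 0
    return int(np.sum(np.any(decreasing, axis=1)))
```

In a deep learning setting, the same terms would be computed on the outputs of a shared network with one head for the mean and one per quantile, which is the sense in which the problem becomes multitask; the NumPy version here only illustrates the arithmetic.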
