Uncertainty Modelling in Deep Networks: Forecasting Short and Noisy Series

Deep Learning is a consolidated, state-of-the-art Machine Learning tool for fitting a function when large data sets of examples are available. However, in regression tasks, a straightforward application of Deep Learning models provides only a point estimate of the target and does not account for the uncertainty of the prediction. This is a serious limitation for tasks where communicating an erroneous prediction carries a risk. In this paper we tackle a real-world problem: forecasting the impending financial expenses and incomings of customers and displaying predictable monetary amounts on a mobile app. In this context, we investigate whether applying Deep Learning models with a heteroscedastic model of the variance of the network's output offers an advantage. Experimentally, we achieve higher accuracy than non-trivial baselines. More importantly, we introduce a mechanism to discard low-confidence predictions so that they are not shown to users. This should help enhance the user experience of our product.
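The two ideas in the abstract, a heteroscedastic variance output and the discarding of low-confidence predictions, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian likelihood in which the network outputs a mean and a log-variance per example, and it assumes that "low confidence" means a predicted standard deviation above a fixed threshold. The function names, the threshold `max_std`, and the sample values are all hypothetical.

```python
import math


def gaussian_nll(y_true, mean, log_var):
    """Negative log-likelihood of y_true under N(mean, exp(log_var)).

    Assumption: the network predicts log-variance, which keeps the
    variance positive without any constraint on the raw output.
    """
    variance = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (y_true - mean) ** 2 / variance)


def filter_confident(means, log_vars, max_std):
    """Return (index, mean, std) for predictions whose std is at most max_std.

    Assumption: confidence is judged only by the predicted standard deviation.
    """
    kept = []
    for index, (mean, log_var) in enumerate(zip(means, log_vars)):
        std = math.exp(0.5 * log_var)
        if std <= max_std:
            kept.append((index, mean, std))
    return kept


# Hypothetical network outputs for three customers (monetary amounts).
means = [120.0, 45.0, 300.0]
log_vars = [math.log(4.0), math.log(400.0), math.log(25.0)]  # std of about 2, 20, 5

# Only predictions with a predicted std of at most 10 would be displayed.
shown = filter_confident(means, log_vars, max_std=10.0)
for index, mean, std in shown:
    print(f"customer {index}: predicted {mean:.2f} (std {std:.2f})")
```

Under this loss, a large error is penalised less when the predicted variance is also large, while the `log_var` term discourages the network from inflating the variance everywhere. That trade-off is what would let the predicted variance act as a per-example confidence signal.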

Jordi Vitrià | José A. Rodríguez-Serrano | Roberto Maestre | Axel Brando | Mauricio Ciprian
