论文信息 - Accurate and Diverse Sampling of Sequences Based on a "Best of Many" Sample Objective

Accurate and Diverse Sampling of Sequences Based on a "Best of Many" Sample Objective

For autonomous agents to successfully operate in the real world, anticipation of future events and states of their environment is a key competence. This problem has been formalized as a sequence extrapolation problem, where a number of observations are used to predict the sequence into the future. Real-world scenarios demand a model of uncertainty of such predictions, as predictions become increasingly uncertain - in particular on long time horizons. While impressive results have been shown on point estimates, scenarios that induce multi-modal distributions over future sequences remain challenging. Our work addresses these challenges in a Gaussian Latent Variable model for sequence prediction. Our core contribution is a "Best of Many" sample objective that leads to more accurate and more diverse predictions that better capture the true variations in real-world sequence data. Beyond our analysis of improved model fit, our models also empirically outperform prior work on three diverse tasks ranging from traffic scenes to weather data.

[1] Michael Comenetz. Calculus: The Elements , 2002 .

[2] Pushmeet Kohli,et al. Multiple Choice Learning: Learning to Produce Multiple Structured Outputs , 2012, NIPS.

[3] Ruslan Salakhutdinov,et al. Learning Stochastic Feedforward Neural Networks , 2013, NIPS.

[4] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.

[5] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[6] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.

[7] Max Welling,et al. Semi-supervised Learning with Deep Generative Models , 2014, NIPS.

[8] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[9] Varun Ramakrishna,et al. Predicting Multiple Structured Visual Interpretations , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10] Honglak Lee,et al. Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[11] Tapani Raiko,et al. Techniques for Learning Binary Stochastic Feedforward Neural Networks , 2014, ICLR.

[12] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.

[13] Andriy Mnih,et al. Variational Inference for Monte Carlo Objectives , 2016, ICML.

[14] Jiajun Wu,et al. Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks , 2016, NIPS.

[15] Ruslan Salakhutdinov,et al. Importance Weighted Autoencoders , 2015, ICLR.

[16] Silvio Savarese,et al. Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes , 2016, ECCV.

[17] Sergey Levine,et al. MuProp: Unbiased Backpropagation for Stochastic Neural Networks , 2015, ICLR.

[18] Michael Cogswell,et al. Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles , 2016, NIPS.

[19] Silvio Savarese,et al. Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Sergey Levine,et al. Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.

[21] Dit-Yan Yeung,et al. Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model , 2017, NIPS.

[22] Yang Gao,et al. End-to-End Learning of Driving Models from Large-Scale Video Datasets , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23] Jinwoo Shin,et al. Simplified Stochastic Feedforward Neural Networks , 2017, ArXiv.

[24] Philip H. S. Torr,et al. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Katerina Fragkiadaki,et al. Motion Prediction Under Multimodality with Conditional Stochastic Networks , 2017, ArXiv.

[26] Peter V. Gehler,et al. A Generative Model of People in Clothing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27] David Acuna. Unsupervised modeling of the movement of basketball players using a Deep Generative Model , 2022 .