论文信息 - Space-Time-Aware Multi-Resolution Video Enhancement

Space-Time-Aware Multi-Resolution Video Enhancement

We consider the problem of space-time super-resolution (ST-SR): increasing spatial resolution of video frames and simultaneously interpolating frames to increase the frame rate. Modern approaches handle these axes one at a time. In contrast, our proposed model called STARnet super-resolves jointly in space and time. This allows us to leverage mutually informative relationships between time and space: higher resolution can provide more detailed information about motion, and higher frame-rate can provide better pixel alignment. The components of our model that generate latent low- and high-resolution representations during ST-SR can be used to finetune a specialized mechanism for just spatial or just temporal super-resolution. Experimental results demonstrate that STARnet improves the performances of space-time, spatial, and temporal video super-resolution by substantial margins on publicly available datasets.

[1] Xiao Xiang Zhu,et al. Learning Spectral-Spatial-Temporal Features via a Recurrent Convolutional Neural Network for Change Detection in Multispectral Imagery , 2018, IEEE Transactions on Geoscience and Remote Sensing.

[2] Ce Liu,et al. Exploring new representations and applications for motion analysis , 2009 .

[3] Alexei A. Efros,et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4] Leonidas J. Guibas,et al. Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5] Narendra Ahuja,et al. Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Wei Shen,et al. Spatial-temporal convolutional neural networks for anomaly detection and localization in crowded scenes , 2016, Signal Process. Image Commun..

[7] Gregory D. Hager,et al. Segmental Spatiotemporal CNNs for Fine-Grained Action Segmentation , 2016, ECCV.

[8] Matthew A. Brown,et al. Frame-Recurrent Video Super-Resolution , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9] Michael J. Black,et al. Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Alexei A. Efros,et al. Learning Dense Correspondence via 3D-Guided Cycle Consistency , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Qiang Wang,et al. Fast Online Object Tracking and Segmentation: A Unifying Approach , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Subhashis Banerjee,et al. Space-Time Super-Resolution Using Graph-Cut Optimization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13] Jianbo Shi,et al. Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14] H. Ashida,et al. What makes space-time interactions in human vision asymmetrical? , 2015, Front. Psychol..

[15] Bernard Ghanem,et al. Finding Tiny Faces in the Wild with Generative Adversarial Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16] Xiaoou Tang,et al. Image Super-Resolution Using Deep Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17] Kyoung Mu Lee,et al. Accurate Image Super-Resolution Using Very Deep Convolutional Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Feng Liu,et al. Context-Aware Synthesis for Video Frame Interpolation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[20] Christian Ledig,et al. Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21] Michael J. Black,et al. Video Segmentation via Object Flow , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Maarten Speekenbrink,et al. Cross-dimensional magnitude interactions arise from memory interference , 2017, Cognitive Psychology.

[23] Carsten Rother,et al. Panoptic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Seoung Wug Oh,et al. Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.

[26] Bernard Ghanem,et al. Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Liang Wang,et al. Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution , 2015, NIPS.

[28] Cordelia Schmid,et al. EpicFlow: Edge-preserving interpolation of correspondences for optical flow , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29] H. Ashida,et al. Temporal Cognition Can Affect Spatial Cognition More Than Vice Versa: The Effect of Task-Related Stimulus Saliency. , 2019, Multisensory research.

[30] Tomer Peleg,et al. IM-Net for High Resolution Video Frame Interpolation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31] Zhiyong Gao,et al. MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32] Richard Szeliski,et al. A Database and Evaluation Methodology for Optical Flow , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[33] Jan Kautz,et al. Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34] Dinesh Rajan,et al. Unified Blind Method for Multi-Image Super-Resolution and Single/Multi-Image Blur Deconvolution , 2013, IEEE Transactions on Image Processing.

[35] Yaron Caspi,et al. Under the supervision of , 2003 .

[36] Alan C. Bovik,et al. Making a “Completely Blind” Image Quality Analyzer , 2013, IEEE Signal Processing Letters.

[37] Camilo C. Dorea,et al. Super Resolution for Multiview Images Using Depth Information , 2010, IEEE Transactions on Circuits and Systems for Video Technology.

[38] Jian Yang,et al. Image Super-Resolution via Deep Recursive Residual Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Hongdong Li,et al. Learning Image Matching by Simply Watching Video , 2016, ECCV.

[40] Yongqiang Zhang,et al. SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network , 2018, ECCV.

[41] Renjie Liao,et al. Video Super-Resolution via Deep Draft-Ensemble Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[42] Kyoung Mu Lee,et al. Deeply-Recursive Convolutional Network for Image Super-Resolution , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43] Yaron Caspi,et al. Increasing Space-Time Resolution in Video , 2002, ECCV.

[44] Gregory Shakhnarovich,et al. Deep Back-ProjectiNetworks for Single Image Super-Resolution , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45] Markus H. Gross,et al. PhaseNet for Video Frame Interpolation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46] Jiajun Wu,et al. Video Enhancement with Task-Oriented Flow , 2018, International Journal of Computer Vision.

[47] Radu Timofte,et al. 2018 PIRM Challenge on Perceptual Image Super-resolution , 2018, ArXiv.

[48] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[49] Thomas Brox,et al. Generating Images with Perceptual Similarity Metrics based on Deep Networks , 2016, NIPS.

[50] Xiaoyun Zhang,et al. Depth-Aware Video Frame Interpolation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51] Daniel Rueckert,et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52] Gregory Shakhnarovich,et al. Recurrent Back-Projection Network for Video Super-Resolution , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[53] Feng Liu,et al. Video Frame Interpolation via Adaptive Separable Convolution , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[54] Gregory Shakhnarovich,et al. Task-Driven Super Resolution: Object Detection in Low-resolution Images , 2018, ICONIP.

[55] Xiaoou Tang,et al. Video Frame Synthesis Using Deep Voxel Flow , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[56] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[57] Chao Ren,et al. Space-time super-resolution with patch group cuts prior , 2015, Signal Process. Image Commun..

[58] Manoj Sharma,et al. Space-Time Super-Resolution Using Deep Learning Based Framework , 2017, PReMI.

[59] Oisin Mac Aodha,et al. Unsupervised Monocular Depth Estimation with Left-Right Consistency , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[61] Gregory Shakhnarovich,et al. Deep Back-Projection Networks for Super-Resolution , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.