FrameExit: Conditional Early Exiting for Efficient Video Recognition
Amirhossein Habibian | Amir Ghodrati | Babak Ehteshami Bejnordi
[1] H. T. Kung, et al. BranchyNet: Fast inference via early exiting from deep neural networks, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[2] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[3] Bernard Ghanem, et al. ActivityNet: A large-scale video benchmark for human activity understanding, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yi Yang, et al. Watching a Small Portion could be as Good as Watching All: Towards Efficient Video Classification, 2018, IJCAI.
[5] A. Oliva, et al. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition, 2021, ICLR.
[6] Max Welling, et al. Batch-shaping for learning conditional channel gated networks, 2019, ICLR.
[7] Yang Li, et al. GaterNet: Dynamic Filter Selection in Convolutional Neural Network via a Dedicated Global Gating Network, 2018, ArXiv.
[8] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[9] Larry S. Davis, et al. A Coarse-to-Fine Framework for Resource Efficient Video Recognition, 2019, International Journal of Computer Vision.
[10] Luc Van Gool, et al. Large Scale Holistic Video Understanding, 2019, ECCV.
[11] Yutaka Satoh, et al. Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition, 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[12] Juan Carlos Niebles, et al. RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition, 2020, ECCV.
[13] Simone Calderara, et al. Conditional Channel Gated Networks for Task-Aware Continual Learning, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[15] Christoph Feichtenhofer, et al. X3D: Expanding Architectures for Efficient Video Recognition, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Larry S. Davis, et al. AdaFrame: Adaptive Frame Selection for Fast Video Recognition, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Jordi Torres, et al. Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks, 2017, ICLR.
[18] Larry S. Davis, et al. BlockDrop: Dynamic Inference Paths in Residual Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[19] Quanfu Fan, et al. More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation, 2019, NeurIPS.
[20] Lorenzo Torresani, et al. SCSampler: Sampling Salient Clips From Video for Efficient Action Recognition, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[21] Xin Wang, et al. SkipNet: Learning Dynamic Routing in Convolutional Networks, 2017, ECCV.
[22] Shih-Fu Chang, et al. ConvNet Architecture Search for Spatiotemporal Feature Learning, 2017, ArXiv.
[23] Tao Mei, et al. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[24] Jitendra Malik, et al. SlowFast Networks for Video Recognition, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[26] Quoc V. Le, et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.
[27] Tinne Tuytelaars, et al. Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Heng Wang, et al. Video Classification With Channel-Separated Convolutional Networks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[29] Li Fei-Fei, et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Cees Snoek, et al. Video Time: Properties, Encoders and Evaluation, 2018, BMVC.
[31] Thomas Brox, et al. ECO: Efficient Convolutional Network for Online Video Understanding, 2018, ECCV.
[32] Ahmet Gunduz, et al. Resource Efficient 3D Convolutional Neural Networks, 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[33] Zhuowen Tu, et al. Deeply-Supervised Nets, 2014, AISTATS.
[34] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[35] Kilian Q. Weinberger, et al. Multi-Scale Dense Networks for Resource Efficient Image Classification, 2017, ICLR.
[36] Serge J. Belongie, et al. Convolutional Networks with Adaptive Inference Graphs, 2017, International Journal of Computer Vision.
[37] Mihir Jain, et al. TimeGate: Conditional Gating of Segments in Long-range Activities, 2020, ArXiv.
[38] Chuang Gan, et al. TSM: Temporal Shift Module for Efficient Video Understanding, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[39] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Wenhao Wu, et al. Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[41] Kate Saenko, et al. AR-Net: Adaptive Frame Resolution for Efficient Action Recognition, 2020, ECCV.
[42] Li Zhang, et al. Spatially Adaptive Computation Time for Residual Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Matthew J. Hausknecht, et al. Beyond short snippets: Deep networks for video classification, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Tinne Tuytelaars, et al. Rank Pooling for Action Recognition, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Bolei Zhou, et al. Temporal Relational Reasoning in Videos, 2017, ECCV.
[46] Limin Wang, et al. Dynamic Sampling Networks for Efficient Action Recognition in Videos, 2020, IEEE Transactions on Image Processing.
[47] Andrew Zisserman, et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, ArXiv.
[49] Cheng-Zhong Xu, et al. Dynamic Channel Pruning: Feature Boosting and Suppression, 2018, ICLR.
[50] Jifeng Dai, et al. Resolution Adaptive Networks for Efficient Inference, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Yulan Guo, et al. Learning Sparse Masks for Efficient Image Super-Resolution, 2020, ArXiv.
[52] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[53] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[54] Tae-Hyun Oh, et al. Listen to Look: Action Recognition by Previewing Audio, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Chen Sun, et al. Rethinking Spatiotemporal Feature Learning For Video Understanding, 2017, ArXiv.