[1] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[2] Erik Strumbelj, et al. Explaining prediction models and individual predictions with feature contributions, 2014, Knowledge and Information Systems.
[3] Andrew Zisserman, et al. Deep Insights into Convolutional Networks for Video Recognition, 2019, International Journal of Computer Vision.
[4] Dima Damen, et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset, 2018, ArXiv.
[5] Avanti Shrikumar, et al. Learning Important Features Through Propagating Activation Differences, 2017, ICML.
[6] Luc Van Gool, et al. Large Scale Holistic Video Understanding, 2019, ECCV.
[7] Richard P. Wildes, et al. Spatiotemporal Residual Networks for Video Action Recognition, 2016, NIPS.
[8] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.
[9] Shuicheng Yan, et al. Multi-Fiber Networks for Video Recognition, 2018, ECCV.
[10] Chuang Gan, et al. TSM: Temporal Shift Module for Efficient Video Understanding, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[11] Abhishek Das, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[12] L. Shapley. A Value for n-person Games, 1988.
[13] Andrew Zisserman, et al. Convolutional Two-Stream Network Fusion for Video Action Recognition, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Dima Damen, et al. Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[15] H. Young. Monotonic solutions of cooperative games, 1985.
[16] Apostol Natsev, et al. YouTube-8M: A Large-Scale Video Classification Benchmark, 2016, ArXiv.
[17] John Folkesson, et al. Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks, 2020, ACCV.
[18] Alexander Binder, et al. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation, 2015, PLoS ONE.
[19] Alexandros Stergiou, et al. Class Feature Pyramids for Video Explanation, 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[20] Pascal Sturmfels, et al. Visualizing the Impact of Feature Attribution Baselines, 2020.
[21] Yoichi Sato, et al. A Comprehensive Study on Visual Explanations for Spatio-temporal Networks, 2020, ArXiv.
[22] Luc Van Gool, et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition, 2016, ECCV.
[23] Vineeth N. Balasubramanian, et al. Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks, 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[24] Mukund Sundararajan, et al. The many Shapley values for model explanation, 2019, ICML.
[25] Susanne Westphal, et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[27] Remco C. Veltkamp, et al. Saliency Tubes: Visual Explanations for Spatio-Temporal Convolutions, 2019, 2019 IEEE International Conference on Image Processing (ICIP).
[28] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[29] Jitendra Malik, et al. SlowFast Networks for Video Recognition, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Zhe L. Lin, et al. Top-Down Neural Attention by Excitation Backprop, 2016, International Journal of Computer Vision.
[31] Bolei Zhou, et al. Temporal Relational Reasoning in Videos, 2017, ECCV.
[32] Ankur Taly, et al. Axiomatic Attribution for Deep Networks, 2017, ICML.
[33] Donghyun Kim, et al. Excitation Backprop for RNNs, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Jonathan Tompson, et al. Temporal Reasoning in Videos Using Convolutional Gated Recurrent Units, 2018, CVPR Workshops.
[35] Chen Sun, et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification, 2017, ECCV.
[36] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Been Kim, et al. Sanity Checks for Saliency Maps, 2018, NeurIPS.
[38] Erik Strumbelj, et al. Explaining instance classifications with interactions of subsets of feature values, 2009, Data Knowl. Eng.
[39] Cordelia Schmid, et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[40] Arnold W. M. Smeulders, et al. Timeception for Complex Action Recognition, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] L. Shapley, et al. Values of Non-Atomic Games, 1974.
[42] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, ArXiv.
[43] Andrea Vedaldi, et al. Interpretable Explanations of Black Boxes by Meaningful Perturbation, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[44] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Erik Strumbelj, et al. An Efficient Explanation of Individual Classifications using Game Theory, 2010, J. Mach. Learn. Res.
[46] Andrew Zisserman, et al. Video Action Transformer Network, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Klaus-Robert Müller, et al. Interpretable human action recognition in compressed domain, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Juan Carlos Niebles, et al. What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[49] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[50] Andrea Vedaldi, et al. Understanding Deep Networks via Extremal Perturbations and Smooth Masks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[51] Ingo Bax, et al. Evaluating visual "common sense" using fine-grained classification and captioning tasks, 2018, ICLR.
[52] Kate Saenko, et al. RISE: Randomized Input Sampling for Explanation of Black-box Models, 2018, BMVC.
[53] Carlos Guestrin, et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.
[54] Andrew Zisserman, et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).