Computer Vision – ECCV 2018

Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos that are long and contain a high proportion of irrelevant content. The problem requires methods not only to generate proposals with precise temporal boundaries, but also to retrieve proposals that cover ground-truth action instances with high recall and high overlap using relatively few proposals. To address these difficulties, we introduce an effective proposal generation method, named Boundary-Sensitive Network (BSN), which adopts a “local to global” fashion. Locally, BSN first locates temporal boundaries with high probabilities, then directly combines these boundaries as proposals. Globally, with the Boundary-Sensitive Proposal feature, BSN retrieves proposals by evaluating the confidence of whether a proposal contains an action within its region. We conduct experiments on two challenging datasets, ActivityNet-1.3 and THUMOS14, where BSN outperforms other state-of-the-art temporal action proposal generation methods with high recall and high temporal precision. Finally, further experiments demonstrate that, when combined with existing action classifiers, our method significantly improves the state-of-the-art temporal action detection performance.
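The "local to global" idea can be illustrated with a minimal sketch. This is not the BSN implementation: the function name `generate_proposals`, the fixed `threshold`, the optional `max_len` limit, and the score (the product of the two boundary probabilities, standing in for the learned proposal confidence) are all assumptions made for illustration. The inputs are assumed to be per-position start and end probabilities that some upstream model has already produced.

```python
from typing import List, Optional, Sequence, Tuple

Proposal = Tuple[int, int, float]  # (start index, end index, score)


def generate_proposals(
    start_prob: Sequence[float],
    end_prob: Sequence[float],
    threshold: float = 0.5,          # hypothetical cut-off for "high probability"
    max_len: Optional[int] = None,   # hypothetical limit on proposal length
) -> List[Proposal]:
    """Pair high-probability start and end positions into ranked proposals."""
    # Local step: keep positions whose boundary probability is high.
    starts = [t for t, p in enumerate(start_prob) if p >= threshold]
    ends = [t for t, p in enumerate(end_prob) if p >= threshold]

    # Combine every start with every later end to form candidate proposals.
    proposals: List[Proposal] = []
    for s in starts:
        for e in ends:
            if e <= s:
                continue
            if max_len is not None and e - s > max_len:
                continue
            # Placeholder score (assumption): a learned confidence model
            # would evaluate each candidate here instead.
            proposals.append((s, e, start_prob[s] * end_prob[e]))

    # Global step: rank candidates so the most confident come first.
    proposals.sort(key=lambda p: p[2], reverse=True)
    return proposals


if __name__ == "__main__":
    start_prob = [0.9, 0.1, 0.2, 0.7, 0.1, 0.1]
    end_prob = [0.1, 0.1, 0.8, 0.1, 0.1, 0.6]
    for s, e, score in generate_proposals(start_prob, end_prob):
        print(f"[{s}, {e}] score={score:.2f}")
```

Ranking candidates by a score is what lets a method reach high recall with relatively few proposals: only the top-ranked candidates need to be kept.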
