A Perceptual Prediction Framework for Self Supervised Event Segmentation
[1] Kevin Murphy, et al. What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision, 2015, NAACL.
[2] Stephen Grossberg, et al. Adaptive Resonance Theory, 2010, Encyclopedia of Machine Learning.
[3] Subhashini Venugopalan, et al. Translating Videos to Natural Language Using Deep Recurrent Neural Networks, 2014, NAACL.
[4] Sudeep Sarkar, et al. Towards a Knowledge-Based Approach for Generating Video Descriptions, 2017, 2017 14th Conference on Computer and Robot Vision (CRV).
[5] Jeffrey M. Zacks, et al. Event Segmentation, 2007, Current directions in psychological science.
[6] Joo-Hwee Lim, et al. Predicting Visual Context for Unsupervised Event Segmentation in Continuous Photo-streams, 2018, ACM Multimedia.
[7] Juan Carlos Niebles, et al. Connectionist Temporal Modeling for Weakly Supervised Action Labeling, 2016, ECCV.
[8] Cordelia Schmid, et al. Weakly Supervised Action Labeling in Videos under Ordering Constraints, 2014, ECCV.
[9] Trevor Darrell, et al. Sequence to Sequence -- Video to Text, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[10] Jeffrey M. Zacks, et al. Perceiving, remembering, and communicating structure in events, 2001, Journal of experimental psychology. General.
[11] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[12] Juergen Gall, et al. Temporal Action Detection Using a Statistical Language Model, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Ivan Laptev, et al. Unsupervised Learning from Narrated Instruction Videos, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] C. V. Jawahar, et al. Unsupervised Learning of Deep Feature Representation for Clustering Egocentric Actions, 2017, IJCAI.
[15] Antonio Torralba, et al. Anticipating Visual Representations from Unlabeled Video, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] David B. Leake, et al. Modelling Unsupervised Event Segmentation: Learning Event Boundaries from Prediction Errors, 2017, CogSci.
[17] Fei-Fei Li, et al. Large-Scale Video Classification with Convolutional Neural Networks, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[18] Juergen Gall, et al. Weakly Supervised Action Learning with RNN Based Fine-to-Coarse Modeling, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Catherine Hanson, et al. Development of Schemata during Event Parsing: Neisser's Perceptual Cycle as a Recurrent Connectionist Network, 1996, Journal of Cognitive Neuroscience.
[20] Sinisa Todorovic, et al. Temporal Deformable Residual Networks for Action Segmentation in Videos, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Gregory D. Hager, et al. Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions, 2009, CVPR.
[22] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[23] Christopher Joseph Pal, et al. Describing Videos by Exploiting Temporal Structure, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[24] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[25] T. Albright. Perceiving, 2015, Daedalus.
[26] Fadime Sener, et al. Unsupervised Learning and Segmentation of Complex Activities from Video, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[27] Stephen J. McKenna, et al. Combining embedded accelerometers with computer vision for recognizing food preparation activities, 2013, UbiComp.
[28] Chenliang Xu, et al. Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[29] Denis Fize, et al. Speed of processing in the human visual system, 1996, Nature.
[30] Yang Yang, et al. Bidirectional Long-Short Term Memory for Video Description, 2016, ACM Multimedia.
[31] Thomas Serre, et al. The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[32] Heng Tao Shen, et al. Attention-based LSTM with Semantic Consistency for Videos Captioning, 2016, ACM Multimedia.
[33] Gregory D. Hager, et al. Segmental Spatiotemporal CNNs for Fine-Grained Action Segmentation, 2016, ECCV.
[34] J M Fuster, et al. The prefrontal cortex and its relation to behavior, 1991, Progress in brain research.
[35] Sudeep Sarkar, et al. Spatially Coherent Interpretations of Videos Using Pattern Theory, 2016, International Journal of Computer Vision.
[36] Jeffrey M. Zacks, et al. Event structure in perception and conception, 2001, Psychological bulletin.
[37] Cordelia Schmid, et al. Action Recognition with Improved Trajectories, 2013, 2013 IEEE International Conference on Computer Vision.
[38] Gregory D. Hager, et al. Temporal Convolutional Networks for Action Segmentation and Detection, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] W. Kintsch, et al. Strategies of discourse comprehension, 1983.
[40] Luc Van Gool, et al. Dynamic Filter Networks, 2016, NIPS.
[41] Sudeep Sarkar, et al. Exploiting Semantic Contextualization for Interpretation of Human Activity in Videos, 2017, ArXiv.
[42] Thomas Serre, et al. An end-to-end generative framework for video segmentation and recognition, 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).