A new multimodal deep-learning model to video scene segmentation
Rudinei Goularte | Tiago Henrique Trojahn | Rodrigo Mitsuo Kishi