Predicting audio-visual salient events based on visual, audio and text modalities for movie summarization
Petros Maragos | Athanasios Katsamanis | Petros Koutras | Athanasia Zlatintsi | Alexandros Potamianos | Elias Iosif
[1] Petros Maragos,et al. A saliency-based approach to audio event detection and summarization , 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).
[2] Hugo Fastl,et al. Psychoacoustics: Facts and Models , 1990 .
[3] Lie Lu,et al. A generic framework of user attention model and its application in video summarization , 2005, IEEE Trans. Multim..
[4] H. M. Teager,et al. Evidence for Nonlinear Sound Production Mechanisms in the Vocal Tract , 1990 .
[5] Alfredo Restrepo,et al. Localized measurement of emergent image frequencies by Gabor wavelets , 1992, IEEE Trans. Inf. Theory.
[6] Shrikanth S. Narayanan,et al. Distributional Semantic Models for Affective Text Analysis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Petros Maragos,et al. On the Effects of Filterbank Design and Energy Computation on Robust Speech Recognition , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Hugo Fastl,et al. Psychoacoustics: Facts and Models. 2nd updated edition , 1999 .
[9] Yael Pritch,et al. Non-Chronological Video , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Shrikanth S. Narayanan,et al. Toward detecting emotions in spoken dialogs , 2005, IEEE Transactions on Speech and Audio Processing.
[11] M. Bradley,et al. Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings , 1999 .
[12] Petros Maragos,et al. Advances on action recognition in videos using an interest point detector based on multiband spatio-temporal energies , 2014, 2014 IEEE International Conference on Image Processing (ICIP).
[13] M. Bradley,et al. Affective Norms for English Words (ANEW): Stimuli, instruction manual and affective ratings (Tech Report C-1) , 1999 .
[14] Chong-Wah Ngo,et al. Summarizing Rushes Videos by Motion, Object, and Event Understanding , 2012, IEEE Transactions on Multimedia.
[15] Michael L. Littman,et al. Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus , 2002, ArXiv.
[16] Zellig S. Harris,et al. Distributional Structure , 1954 .
[17] S. David,et al. Auditory attention : focusing the searchlight on sound , 2007 .
[18] J. F. Kaiser,et al. On a simple algorithm to calculate the 'energy' of a signal , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[19] J. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. , 1985, Journal of the Optical Society of America. A, Optics and image science.
[20] Petros Maragos,et al. Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention , 2013, IEEE Transactions on Multimedia.
[21] Alan Hanjalic,et al. An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis , 1999, IEEE Trans. Circuits Syst. Video Technol..
[22] Harry W. Agius,et al. Video summarisation: A conceptual framework and survey of the state of the art , 2008, J. Vis. Commun. Image Represent..
[23] Yongjie Li,et al. A Color Constancy Model with Double-Opponency Mechanisms , 2013, 2013 IEEE International Conference on Computer Vision.
[24] D J Heeger,et al. Model for the extraction of image flow. , 1987, Journal of the Optical Society of America. A, Optics and image science.
[25] Petros Maragos,et al. Video event detection and summarization using audio, visual and text saliency , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] J. Daugman. Two-dimensional spectral analysis of cortical receptive field profiles , 1980, Vision Research.
[27] S. Shamma,et al. Interaction between Attention and Bottom-Up Saliency Mediates the Representation of Foreground and Background in an Auditory Scene , 2009, PLoS biology.
[28] Xin Liu,et al. Video summarization using singular value decomposition , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).
[29] Yueting Zhuang,et al. Adaptive key frame extraction using unsupervised clustering , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).
[30] Christof Koch,et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .
[31] Michael T. Lippert,et al. Mechanisms for Allocating Auditory Attention: An Auditory Saliency Map , 2005, Current Biology.
[32] Xavier Binefa,et al. An EM algorithm for video summarization, generative model approach , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[33] Alan C. Bovik,et al. Multidimensional quasi-eigenfunction approximations and multicomponent AM-FM models , 2000, IEEE Trans. Image Process..
[34] Prashant Parikh. A Theory of Communication , 2010 .
[35] Preslav Nakov,et al. SemEval-2013 Task 2: Sentiment Analysis in Twitter , 2013, *SEMEVAL.
[36] Janusz Konrad,et al. Video Condensation by Ribbon Carving , 2009, IEEE Transactions on Image Processing.
[37] Zhu Liu,et al. Multimedia content analysis-using both audio and visual clues , 2000, IEEE Signal Process. Mag..
[38] S Ullman,et al. Shifts in selective visual attention: towards the underlying neural circuitry. , 1985, Human neurobiology.
[39] R. Plomp,et al. Tonal consonance and critical bandwidth. , 1965, The Journal of the Acoustical Society of America.
[40] A. Coutrot,et al. How saliency, faces, and sound influence gaze in dynamic social scenes. , 2014, Journal of vision.