Efficient video coding in H.264/AVC by using audio-visual information
[1] H. McGurk, et al. Hearing lips and seeing voices, 1976, Nature.
[2] Laurent Itti, et al. Automatic foveation for video compression using a neurobiological model of visual attention, 2004, IEEE Transactions on Image Processing.
[3] Andrea Cavallaro, et al. Target Detection and Tracking With Heterogeneous Sensors, 2008, IEEE Journal of Selected Topics in Signal Processing.
[4] J.N. Gowdy, et al. CUAVE: A new audio-visual database for multimodal human-computer interface research, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] Chih-Wei Tang, et al. Spatiotemporal Visual Considerations for Video Coding, 2007, IEEE Transactions on Multimedia.
[6] Cheol Hoon Park, et al. Robust Audio-Visual Speech Recognition Based on Late Integration, 2008, IEEE Transactions on Multimedia.
[7] J. Driver, et al. Audiovisual links in endogenous covert spatial attention, 1996, Journal of Experimental Psychology: Human Perception and Performance.
[8] Touradj Ebrahimi, et al. Semantic video analysis for adaptive content delivery and automatic description, 2005, IEEE Transactions on Circuits and Systems for Video Technology.
[9] C. Spence, et al. Attention and the crossmodal construction of space, 1998, Trends in Cognitive Sciences.
[10] Steven A. Hillyard, et al. Neural Substrates of Perceptual Enhancement by Cross-Modal Spatial Attention, 2003, Journal of Cognitive Neuroscience.
[11] Jean-Philippe Thiran, et al. Extraction of Audio Features Specific to Speech Production for Multimodal Speaker Detection, 2008, IEEE Transactions on Multimedia.
[12] Michael Elad, et al. Cross-Modal Localization via Sparsity, 2007, IEEE Transactions on Signal Processing.
[13] J. Driver, et al. Audiovisual links in exogenous covert spatial orienting, 1997, Perception & Psychophysics.
[14] Christian Jutten, et al. Mixing Audiovisual Speech Processing and Blind Source Separation for the Extraction of Speech Signals From Convolutive Mixtures, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Paolo Napoletano, et al. Bayesian Integration of Face and Low-Level Cues for Foveated Video Coding, 2008, IEEE Transactions on Circuits and Systems for Video Technology.
[16] B. Stein, et al. The Merging of the Senses, 1993.
[17] Vladimir Pavlovic, et al. Toward multimodal human-computer interface, 1998, Proceedings of the IEEE.
[18] Sugato Chakravarty, et al. Methodology for the subjective assessment of the quality of television pictures, 1995.
[19] Touradj Ebrahimi, et al. Video coding based on audio-visual attention, 2009, 2009 IEEE International Conference on Multimedia and Expo.
[20] A. Murat Tekalp, et al. Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis, 2007, IEEE Transactions on Multimedia.
[21] Patrick Pérez, et al. Data fusion for visual tracking with particles, 2004, Proceedings of the IEEE.