Multimodal speaker clustering in full length movies
Anastasios Tefas | Ioannis Pitas | Nikos Nikolaidis | Ioannis Kapsouras | Geoffroy Peeters | Laurent Benaroya
[1] Michael I. Jordan,et al. On Spectral Clustering: Analysis and an Algorithm , 2001, NIPS.
[2] Matti Pietikäinen,et al. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions , 1994, Proceedings of 12th International Conference on Pattern Recognition.
[3] J. Calic,et al. A Survey on Multimodal Video Representation for Semantic Retrieval , 2005, EUROCON 2005 - The International Conference on "Computer as a Tool".
[4] Nicu Sebe,et al. Event Oriented Dictionary Learning for Complex Event Detection , 2015, IEEE Transactions on Image Processing.
[5] Anastasios Tefas,et al. Facial image clustering in stereoscopic videos using double spectral analysis , 2015, Signal Process. Image Commun..
[6] Cordelia Schmid,et al. Evaluation of Local Spatio-temporal Features for Action Recognition , 2009, BMVC.
[7] Chuohao Yeo,et al. Multi-modal speaker diarization of real-world meetings using compressed-domain video features , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Hervé Bourlard,et al. Using audio and visual cues for speaker diarisation initialisation , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Stefanos Zafeiriou,et al. Robust Discriminative Response Map Fitting with Constrained Local Models , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[10] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[11] Philippe Joly,et al. Audiovisual diarization of people in video content , 2012, Multimedia Tools and Applications.
[12] Marcel Worring,et al. Multimedia Event-Based Video Indexing: A Review of the State-of-the-art , 2005 .
[13] Nicu Sebe,et al. Multimodal Human Computer Interaction: A Survey , 2005, ICCV-HCI.
[14] S. Chen,et al. Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion , 1998 .
[15] Alexandros Iosifidis,et al. On the kernel Extreme Learning Machine speedup , 2015, Pattern Recognit. Lett..
[16] Alexandros Iosifidis,et al. On the kernel Extreme Learning Machine classifier , 2015, Pattern Recognit. Lett..
[17] Anastasios Tefas,et al. Facial image clustering in stereo videos using local binary patterns and double spectral analysis , 2014, 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM).
[18] Lei Xie,et al. Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models , 2009, J. Vis. Lang. Comput..
[19] Anastasios Tefas,et al. Stereo object tracking with fusion of texture, color and disparity information , 2014, Signal Process. Image Commun..
[20] Ioannis Pitas,et al. Appearance based object tracking in stereo sequences , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Alexandros Iosifidis,et al. Visual voice activity detection based on spatiotemporal information and bag of words , 2015, 2015 IEEE International Conference on Image Processing (ICIP).
[22] Eshed Ohn-Bar,et al. Joint Angles Similarities and HOG2 for Action Recognition , 2013 .
[23] Ming Zhao,et al. Audiovisual celebrity recognition in unconstrained web videos , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[24] Gwenn Englebienne,et al. Multimodal Speaker Diarization , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Khairuddin Omar,et al. An enhanced face detection method using skin color and back-propagation neural network , 2011 .
[26] Chuohao Yeo,et al. Visual speaker localization aided by acoustic models , 2009, MM '09.
[27] Manolis I. A. Lourakis,et al. Tracking of Human Hands and Faces through Probabilistic Fusion of Multiple Visual Cues , 2008, ICVS.
[28] Slim Essid,et al. A Multimodal Approach to Speaker Diarization on TV Talk-Shows , 2013, IEEE Transactions on Multimedia.
[29] Václav Hlavác,et al. Detector of Facial Landmarks Learned by the Structured Output SVM , 2012, VISAPP.
[30] Subramanian Ramanathan,et al. On the relationship between head pose, social attention and personality prediction for unstructured and dynamic group interactions , 2013, ICMI '13.
[31] Nicu Sebe,et al. Analyzing Free-standing Conversational Groups: A Multimodal Approach , 2015, ACM Multimedia.
[32] Radu Horaud,et al. Audio-Visual Clustering for 3D Speaker Localization , 2008, MLMI.
[33] Mohan M. Trivedi,et al. Joint Angles Similarities and HOG2 for Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[34] Ioannis Pitas,et al. A monocular system for person tracking: Implementation and testing , 2008, Journal on Multimodal User Interfaces.