Multimodal fusion using dynamic hybrid models
Mohamed R. Amer | Behjat Siddiquie | Ajay Divakaran | Harpreet S. Sawhney | Saad M. Khan
[1] Stephen J. Cox,et al. The challenge of multispeaker lip-reading , 2008, AVSP.
[2] Naonori Ueda,et al. Semisupervised Learning for a Hybrid Generative/Discriminative Classifier based on the Maximum Entropy Principle , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Sridha Sridharan,et al. Patch-Based Representation of Visual Speech , 2006 .
[4] Jean-Philippe Thiran,et al. Information Theoretic Feature Extraction for Audio-Visual Speech Recognition , 2009, IEEE Transactions on Signal Processing.
[5] Björn W. Schuller,et al. AVEC 2011-The First International Audio/Visual Emotion Challenge , 2011, ACII.
[6] Louis-Philippe Morency,et al. Modeling Latent Discriminative Dynamic of Multi-dimensional Affective Signals , 2011, ACII.
[7] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn.
[8] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[9] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[10] Nebojsa Jojic,et al. Free Energy Score Spaces: Using Generative Information in Discriminative Classifiers , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Trevor Darrell,et al. Hidden Conditional Random Fields , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Markus Kächele,et al. Multiple Classifier Systems for the Classification of Audio-Visual Emotional States , 2011, ACII.
[13] Behjat Siddiquie,et al. Affect analysis in natural human interaction using Joint Hidden Conditional Random Fields , 2013, 2013 IEEE International Conference on Multimedia and Expo (ICME).
[14] Andrew McCallum,et al. High-Performance Semi-Supervised Learning using Discriminatively Constrained Generative Models , 2010, ICML.
[15] Geoffrey E. Hinton,et al. The Recurrent Temporal Restricted Boltzmann Machine , 2008, NIPS.
[16] J.N. Gowdy,et al. CUAVE: A new audio-visual database for multimodal human-computer interface research , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[17] Timothy F. Cootes,et al. Extraction of Visual Features for Lipreading , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[19] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[20] Geoffrey E. Hinton,et al. Two Distributed-State Models For Generating High-Dimensional Time Series , 2011, J. Mach. Learn. Res.
[21] Petros Maragos,et al. Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[23] Geoffrey E. Hinton,et al. Phone recognition using Restricted Boltzmann Machines , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[24] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res.
[25] Matti Pietikäinen,et al. Lipreading with Local Spatiotemporal Descriptors , 2009, IEEE Transactions on Multimedia.
[26] Max Welling,et al. Hidden-Unit Conditional Random Fields , 2011, AISTATS.
[27] Geoffrey E. Hinton,et al. Learning Multilevel Distributed Representations for High-Dimensional Sequences , 2007, AISTATS.
[28] David J. Fleet,et al. Dynamical binary latent variable models for 3D human pose tracking , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.