Perspectives on predictive power of multimodal deep learning: surprises and future directions
Samy Bengio | Björn Schuller | Louis-Philippe Morency | Li Deng
[1] Louis-Philippe Morency, et al. Multimodal Machine Learning: A Survey and Taxonomy, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] George Trigeorgis, et al. Deep Canonical Time Warping for Simultaneous Alignment and Representation Learning of Sequences, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Stefanos Zafeiriou, et al. End2You - The Imperial Toolkit for Multimodal Profiling by End-to-End Learning, 2018, ArXiv.
[4] Li Deng, et al. Artificial Intelligence in the Rising Wave of Deep Learning: The Historical Path and Future Outlook [Perspectives], 2018, IEEE Signal Processing Magazine.
[5] Li Deng, et al. Question-Answering with Grammatically-Interpretable Representations, 2017, AAAI.
[6] Li Deng, et al. Deep Learning for Image-to-Text Generation: A Technical Overview, 2017, IEEE Signal Processing Magazine.
[7] Alfred O. Hero, et al. Challenges and Open Problems in Signal Processing: Panel Discussion Summary from ICASSP 2017 [Panel and Forum], 2017, IEEE Signal Processing Magazine.
[8] Björn W. Schuller, et al. From Hard to Soft: Towards more Human-like Emotion Recognition by Modelling the Perception Uncertainty, 2017, ACM Multimedia.
[9] Fabien Ringeval, et al. End-to-end learning for dimensional emotion recognition from physiological signals, 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).
[10] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.
[11] Pascale Fung, et al. A Long Short-Term Memory Framework for Predicting Humor in Dialogues, 2016, NAACL.
[12] George Trigeorgis, et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.
[14] Jianfeng Gao, et al. Reasoning in Vector Space: An Exploratory Study of Question Answering, 2016, ICLR.
[15] Jason Yosinski, et al. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Eduardo Coutinho, et al. Cooperative Learning and its Application to Emotion Recognition from Speech, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[18] Ruslan Salakhutdinov, et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, 2014, ArXiv.
[19] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[20] Yifan Gong, et al. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Geoffrey Zweig, et al. Recent advances in deep learning for speech research at Microsoft, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Xiao Li, et al. Machine Learning Paradigms for Speech Recognition: An Overview, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[24] Maja Pantic, et al. Classifying laughter and speech using audio-visual feature prediction, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[25] Björn W. Schuller, et al. Recognition of spontaneous conversational speech using long short-term memory phoneme predictions, 2010, INTERSPEECH.
[26] Hui Lin, et al. A study on multilingual acoustic modeling for large vocabulary ASR, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[27] Björn W. Schuller, et al. Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies, 2008, INTERSPEECH.
[28] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[29] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[30] Björn W. Schuller, et al. A Combined LSTM-RNN - HMM - Approach for Meeting Event Segmentation and Recognition, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.