M3: MultiModal Masking Applied to Sentiment Analysis
Georgios Paraskevopoulos | Alexandros Potamianos | Efthymios Georgiou
[1] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[2] Yao-Hung Hubert Tsai, et al. Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis, 2020, EMNLP.
[3] Yingyu Liang, et al. Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis, 2019, AAAI.
[4] Lucia Specia, et al. Probing the Need for Visual Context in Multimodal Machine Translation, 2019, NAACL.
[5] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[6] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.
[7] Alexander Gelbukh, et al. DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation, 2019, EMNLP.
[8] Tae-Hyun Oh, et al. Learning to Localize Sound Source in Visual Scenes, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] John Kane, et al. COVAREP — A collaborative voice analysis repository for speech technologies, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules, 2017, NIPS.
[11] Zhongkai Sun, et al. Multi-modal Sentiment Analysis using Deep Canonical Correlation Analysis, 2019, INTERSPEECH.
[12] Shinsuke Shimojo, et al. Development of multisensory spatial integration and perception in humans, 2006, Developmental science.
[13] Louis-Philippe Morency, et al. Integrating Multimodal Information in Large Pretrained Transformers, 2020, ACL.
[14] Louis-Philippe Morency, et al. Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors, 2018, AAAI.
[15] Du Tran, et al. What Makes Training Multi-Modal Classification Networks Hard?, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Georgios Paraskevopoulos, et al. Multimodal and Multiresolution Speech Recognition with Transformers, 2020, ACL.
[17] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[18] Erik Cambria, et al. Tensor Fusion Network for Multimodal Sentiment Analysis, 2017, EMNLP.
[19] Erik Cambria, et al. Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph, 2018, ACL.
[20] Alexandros Potamianos, et al. Deep Hierarchical Fusion with Application in Sentiment Analysis, 2019, INTERSPEECH.
[21] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[22] Ruslan Salakhutdinov, et al. Multimodal Transformer for Unaligned Multimodal Language Sequences, 2019, ACL.
[23] Ivan Marsic, et al. Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment, 2018, ACL.
[24] Christopher D. Chambers, et al. Current perspectives and methods in studying neural mechanisms of multisensory interactions, 2012, Neuroscience & Biobehavioral Reviews.
[25] Shamane Siriwardhana, et al. Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition, 2020, INTERSPEECH.
[26] Barnabás Póczos, et al. Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities, 2018, AAAI.
[27] Joon Son Chung, et al. ASR is All You Need: Cross-Modal Distillation for Lip Reading, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Louis-Philippe Morency, et al. Multimodal Machine Learning: A Survey and Taxonomy, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.
[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.