Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis
Ruslan Salakhutdinov | Louis-Philippe Morency | Yao-Hung Hubert Tsai | Martin Q. Ma | Muqiao Yang
[1] H. McGurk, et al. Hearing lips and seeing voices, 1976, Nature.
[2] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[3] Karl J. Friston, et al. A multimodal language region in the ventral visual pathway, 1998, Nature.
[4] King-Sun Fu, et al. IEEE Transactions on Pattern Analysis and Machine Intelligence Publication Information, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database, 2008, Lang. Resour. Evaluation.
[6] G. Hommel, et al. Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications, 2009, Deutsches Ärzteblatt international.
[7] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[8] J. Ranstam. Why the P-value culture is bad and confidence intervals a better alternative, 2012, Osteoarthritis and cartilage.
[9] S. Scott, et al. When voices get emotional: A corpus of nonverbal vocalizations for research on emotion processing, 2013, Behavior research methods.
[10] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[11] John Kane, et al. COVAREP — A collaborative voice analysis repository for speech technologies, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[13] Oliver G. B. Garrod, et al. Dynamic Facial Expressions of Emotion Transmit an Evolving Hierarchy of Signals over Time, 2014, Current Biology.
[14] R. Tibshirani, et al. Generalized Additive Models, 1986.
[15] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] Angeliki Lazaridou, et al. Combining Language and Vision with a Multimodal Skip-gram Model, 2015, NAACL.
[18] Carlos Guestrin, et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.
[19] Apostol Natsev, et al. YouTube-8M: A Large-Scale Video Classification Benchmark, 2016, ArXiv.
[20] Trevor Darrell, et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016, EMNLP.
[21] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[22] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules, 2017, NIPS.
[23] Erik Cambria, et al. Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph, 2018, ACL.
[24] Firoj Alam, et al. CrisisMMD: Multimodal Twitter Datasets from Natural Disasters, 2018, ICWSM.
[25] Homayoon S. M. Beigi, et al. Multi-Modal Emotion recognition on IEMOCAP Dataset using Deep Learning, 2018, ArXiv.
[26] Geoffrey E. Hinton, et al. Matrix capsules with EM routing, 2018, ICLR.
[27] Le Song, et al. Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, 2018, ICML.
[28] Louis-Philippe Morency, et al. Efficient Low-rank Multimodal Fusion With Modality-Specific Factors, 2018, ACL.
[29] S. R. Livingstone, et al. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English, 2018, PloS one.
[30] Jennifer Williams, et al. Recognizing Emotions in Video Using Multimodal DNN Feature Fusion, 2018.
[31] Ruslan Salakhutdinov, et al. Learning Factorized Multimodal Representations, 2018, ICLR.
[32] Eric Granger, et al. Multimodal Fusion with Deep Neural Networks for Audio-Video Emotion Recognition, 2019, ArXiv.
[33] Louis-Philippe Morency, et al. Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors, 2018, AAAI.
[34] Louis-Philippe Morency, et al. Multimodal Machine Learning: A Survey and Taxonomy, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Ruslan Salakhutdinov, et al. Multimodal Transformer for Unaligned Multimodal Language Sequences, 2019, ACL.
[36] Chuang Gan, et al. The Neuro-Symbolic Concept Learner: Interpreting Scenes Words and Sentences from Natural Supervision, 2019, ICLR.