Temporal Attention Mechanism with Conditional Inference for Large-Scale Multi-label Video Classification
Kyoung-Woon On | Yu-Jung Heo | Byoung-Tak Zhang | Hyun-Dong Lee | Jongseok Kim | Eun-Sol Kim | Seong-Ho Choi
[1] Gunhee Kim, et al. Expressing an Image Stream with a Sequence of Natural Sentences, 2015, NIPS.
[2] Eyke Hüllermeier, et al. Bayes Optimal Multilabel Classification via Probabilistic Classifier Chains, 2010, ICML.
[3] Josef Sivic, et al. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Qin Jin, et al. Multi-modal Dimensional Emotion Recognition using Recurrent Neural Networks, 2015, AVEC@ACM Multimedia.
[5] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[6] Trevor Darrell, et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016, EMNLP.
[7] Hatice Gunes, et al. Continuous Prediction of Spontaneous Affect from Multiple Cues and Modalities in Valence-Arousal Space, 2011, IEEE Transactions on Affective Computing.
[8] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[9] Louis-Philippe Morency, et al. Multimodal Machine Learning: A Survey and Taxonomy, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Johannes Fürnkranz, et al. Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification, 2017, NIPS.
[11] Michael Isard, et al. Object retrieval with large vocabularies and fast spatial matching, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[12] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[13] Apostol Natsev, et al. YouTube-8M: A Large-Scale Video Classification Benchmark, 2016, ArXiv.
[14] Aren Jansen, et al. CNN architectures for large-scale audio classification, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Yoshua Bengio, et al. Hierarchical Multiscale Recurrent Neural Networks, 2016, ICLR.
[16] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[17] Geoff Holmes, et al. Classifier chains for multi-label classification, 2009, Machine Learning.
[18] Ivan Laptev, et al. Learnable pooling with Context Gating for video classification, 2017, ArXiv.
[19] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[20] Gunhee Kim, et al. Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset, 2017, ArXiv.
[21] Andrew Zisserman, et al. All About VLAD, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.