Image2song: Song Retrieval via Bridging Image Content and Lyric Words
[1] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[2] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[3] Michael S. Bernstein,et al. Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Menno van Zaanen,et al. Automatic Mood Classification Using TF*IDF Based on Lyrics , 2010, ISMIR.
[5] Alexei A. Efros,et al. Discovering object categories in image collections , 2005.
[6] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[7] Xuelong Li,et al. Visual music and musical vision , 2008, Neurocomputing.
[8] Sanja Fidler,et al. Song From PI: A Musically Plausible Network for Pop Music Generation , 2016, ICLR.
[9] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[10] Peter Knees,et al. A music search engine built upon audio-based and web-based similarity measures , 2007, SIGIR.
[11] Yanjun Qi,et al. Polynomial Semantic Indexing , 2009, NIPS.
[12] D. Signorini,et al. Neural networks , 1995, The Lancet.
[13] Fei-Fei Li,et al. What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[14] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.
[15] Andreas F. Ehmann,et al. Lyric Text Mining in Music Mood Classification , 2009, ISMIR.
[16] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Òscar Celma,et al. QueryBag: Using Different Sources For Querying Large Music Collections , 2009.
[18] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[19] Hendrik P. A. Lensch,et al. Auto-Illustrating Poems and Songs with Style , 2016, ACCV.
[20] Masataka Goto,et al. Music Thumbnailer: Visualizing Musical Pieces in Thumbnail Images Based on Acoustic Features , 2008, ISMIR.
[21] Samy Bengio,et al. Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.
[22] Honglak Lee,et al. Improved Multimodal Deep Learning with Variation of Information , 2014, NIPS.
[23] Martha Larson,et al. When music makes a scene , 2013, International Journal of Multimedia Information Retrieval.
[24] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[25] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[26] Sidney S. Simon,et al. Merging of the Senses , 2008, Front. Neurosci.
[27] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res.
[28] Bowen Zhou,et al. LSTM-based Deep Learning Models for non-factoid answer selection , 2015, ArXiv.
[29] Jeffrey J. Scott,et al. Music Emotion Recognition: A State of the Art Review , 2010.
[30] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[31] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Yu Zheng,et al. Retrieving Web Images to Enrich Music Representation , 2007, 2007 IEEE International Conference on Multimedia and Expo.
[33] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Xirong Li,et al. Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction , 2016.
[35] Wei-Ying Ma,et al. Automated Music Video Generation using WEB Image Resource , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[36] Phil Blunsom,et al. Teaching Machines to Read and Comprehend , 2015, NIPS.
[37] Chunhua Shen,et al. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Òscar Celma,et al. Search Sounds: An audio crawler focused on weblogs , 2006, ISMIR.
[39] Tao Jin,et al. Automatic Generation of Music Slide Show Using Personal Photos , 2008, 2008 Tenth IEEE International Symposium on Multimedia.
[40] Markus Schedl,et al. Music Information Retrieval: Recent Developments and Applications , 2014, Found. Trends Inf. Retr.
[41] Xirong Li,et al. Word2VisualVec: Cross-Media Retrieval by Visual Feature Prediction , 2016, ArXiv.
[42] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[43] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[44] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Xuelong Li,et al. Multimodal Learning via Exploring Deep Semantic Similarity , 2016, ACM Multimedia.
[46] Ruslan Salakhutdinov,et al. Multimodal Neural Language Models , 2014, ICML.
[47] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.