Enriched Music Representations With Multiple Cross-Modal Contrastive Learning
Dmitry Bogdanov | Andrés Ferraro | Konstantinos Drossos | Xavier Favory | Yuntae Kim
[1] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[2] Xavier Serra, et al. Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Xavier Serra, et al. musicnn: Pre-trained convolutional neural networks for music audio tagging, 2019, ArXiv.
[4] Ce Liu, et al. Supervised Contrastive Learning, 2020, NeurIPS.
[5] Òscar Celma, et al. A new approach to evaluating novel recommendations, 2008, RecSys '08.
[6] Bob L. Sturm, et al. Deep Learning and Music Adversaries, 2015, IEEE Transactions on Multimedia.
[7] Hao-Yu Wu, et al. Classification is a Strong Baseline for Deep Metric Learning, 2018, BMVC.
[8] Jason Weston, et al. WSABIE: Scaling Up to Large Vocabulary Image Annotation, 2011, IJCAI.
[9] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[10] Xavier Serra, et al. Multimodal Metric Learning for Tag-Based Music Retrieval, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[12] Xavier Serra, et al. Multimodal Deep Learning for Music Genre Classification, 2018, Trans. Int. Soc. Music. Inf. Retr.
[13] Aren Jansen, et al. Audio Set: An ontology and human-labeled dataset for audio events, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Xavier Serra, et al. Tensorflow Audio Models in Essentia, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.
[16] George Tzanetakis, et al. Musical genre classification of audio signals, 2002, IEEE Trans. Speech Audio Process.
[17] Xavier Serra, et al. The MTG-Jamendo Dataset for Automatic Music Tagging, 2019, ICML 2019.
[18] Grigorios Tsoumakas, et al. On the Stratification of Multi-label Data, 2011, ECML/PKDD.
[19] Xavier Serra, et al. COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations, 2020, ArXiv.
[20] Juhan Nam, et al. Metric learning vs classification for disentangled music representation learning, 2020, ISMIR.
[21] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[22] Neil Zeghidour, et al. Contrastive Learning of General-Purpose Audio Representations, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Kilian Q. Weinberger, et al. Distance Metric Learning for Large Margin Nearest Neighbor Classification, 2005, NIPS.
[24] Jordi Torres, et al. Cross-modal Embeddings for Video and Audio Retrieval, 2018, ECCV Workshops.
[25] Xavier Serra, et al. Evaluation of CNN-based Automatic Music Tagging Models, 2020, ArXiv.
[26] Noel E. O'Connor, et al. Unsupervised Contrastive Learning of Sound Event Representations, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Minz Won, et al. Mood Classification Using Listening Data, 2020, ISMIR.
[28] Sanjoy Dasgupta, et al. Random projection trees and low dimensional manifolds, 2008, STOC.
[29] Alan F. Smeaton, et al. Contrastive Representation Learning: A Framework and Review, 2020, IEEE Access.
[30] Xavier Serra, et al. Melon Playlist Dataset: A Public Dataset for Audio-Based Playlist Generation and Music Tagging, 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Juhan Nam, et al. Zero-shot Learning for Audio-based Music Classification and Tagging, 2019, ISMIR.
[32] Justin Salamon, et al. Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Xavier Serra, et al. How Low Can You Go? Reducing Frequency and Time Resolution in Current CNN Architectures for Music Auto-tagging, 2021, 2020 28th European Signal Processing Conference (EUSIPCO).