MusiCoder: A Universal Music-Acoustic Encoder Based on Transformers
Jia Guo | Yilun Zhao | Kejun Zhang | Xinda Wu | Yuqing Ye