FMA: A Dataset for Music Analysis

We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in music information retrieval (MIR), a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is, however, held back by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks by 16,341 artists on 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length, high-quality audio and pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. Here we describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available online.
