Data Usage in MIR: History & Future Recommendations

The MIR community faces unique challenges in data access, due in large part to country-specific copyright laws. As a result, there is an emerging divide in the MIR research community between labs that have access to music through large companies with abundant funds and independent labs at smaller institutions that do not have such expansive access. This paper explores how independent researchers have worked to overcome limited access to music data without contributing to the crisis of reproducibility. Acknowledging that there is no single solution for every data access problem that smaller labs face, we propose a number of ways in which the MIR community can bridge the gap between advancements from large companies and those within academia. As MIR looks towards the next 20 years, democratizing and expanding access to MIR research and music data is critical. Future solutions could include a distributed MIREX system, an API designed for MIR researchers, and community-led advocacy to stakeholders.
