Piece Identification in Classical Piano Music Without Reference Scores

In this paper we describe an approach to identifying the name of a piece of piano music from a short audio excerpt of a performance. Given only a textual description of the pieces (i.e., no score information is provided), a reference database is compiled automatically by acquiring a number of audio representations (performances of the pieces) from internet sources. These performances are transcribed, preprocessed, and used to build a reference database via a robust symbolic fingerprinting algorithm, which in turn is used to identify new, incoming queries. The main challenge is the amount of noise introduced into the identification process by the music transcription algorithm and by the automatic (but possibly suboptimal) choice of performances to represent a piece in the reference database. In a number of experiments we show how to improve identification performance by increasing redundancy in the reference database and by using a preprocessing step that rates the reference performances by their suitability as representations of the pieces in question. As the results show, this approach leads to a robust system that is able to identify piano music with high accuracy, without any need for data annotation or manual data preparation.
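To make the idea of symbolic fingerprinting with a redundant reference database more concrete, the following is a minimal toy sketch and not the algorithm used in the paper. It assumes that transcription has already produced a list of (onset time, MIDI pitch) pairs for each performance. The token design (three consecutive pitches plus a quantized ratio of inter-onset intervals), the function names `tokens`, `build_index`, and `identify`, and the parameter `ratio_steps` are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def tokens(notes, ratio_steps=4):
    """Turn a note list into hashable fingerprint tokens.

    notes: list of (onset_seconds, midi_pitch), sorted by onset.
    Each token holds three consecutive pitches and the quantized ratio of
    the two inter-onset intervals, so a uniform tempo change does not
    alter the token (an assumption of this toy design).
    """
    out = []
    for (t1, p1), (t2, p2), (t3, p3) in zip(notes, notes[1:], notes[2:]):
        d1, d2 = t2 - t1, t3 - t2
        if d1 <= 0 or d2 <= 0:
            continue  # skip simultaneous onsets in this simplified version
        out.append((p1, p2, p3, round(ratio_steps * d2 / d1)))
    return out


def build_index(references):
    """Build an inverted index from token to the set of piece names.

    references: iterable of (piece_name, notes). Several performances may
    share one piece_name, which adds redundancy to the index.
    """
    index = defaultdict(set)
    for piece, notes in references:
        for tok in tokens(notes):
            index[tok].add(piece)
    return index


def identify(index, query_notes):
    """Return the piece whose tokens match the query most often, or None."""
    votes = Counter()
    for tok in tokens(query_notes):
        for piece in index.get(tok, ()):
            votes[piece] += 1
    return votes.most_common(1)[0][0] if votes else None
```

In this sketch, adding a second transcribed performance of the same piece simply contributes more tokens under the same piece name, so a query token lost to a transcription error in one reference may still be found via another. A real system would also need to verify the temporal consistency of matches and handle chords, both of which this toy version ignores.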
