Paper information - Piano music transcription modeling note temporal evolution

Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation, such as a MIDI piano roll, which contains the pitches, onsets, and offsets of the notes and, possibly, their dynamics and source (i.e., instrument). Existing AMT algorithms commonly identify pitches and their saliences in each frame and then form notes in a post-processing stage that applies a combination of thresholding, pruning, and smoothing operations. Very few existing methods consider the temporal evolution of notes over multiple frames during the pitch identification stage. In this work, we propose a note-based spectrogram factorization method that uses the entire temporal evolution of piano notes as a template dictionary. The method first uses an artificial neural network to detect note onsets from the audio spectral flux. Next, it estimates the notes present in each audio segment between two successive onsets with a greedy search algorithm. Finally, the spectrogram of each segment is factorized using a discrete combination of note templates composed of full note spectrograms of individual piano notes sampled at different dynamic levels. We also propose a new psychoacoustically informed measure for spectrogram similarity.
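To make the pipeline in the abstract more concrete, the following is a minimal Python sketch of two of its ingredients: a spectral flux feature, and a greedy search that explains a segment spectrogram as a sum of full-note templates. It is an illustration under stated assumptions, not the authors' implementation. The function names (`spectral_flux`, `greedy_note_search`, `toy_template`), the `max_notes` limit, and the toy templates are all hypothetical. The sketch also uses plain squared error in place of the psychoacoustically informed similarity measure the paper proposes, and it omits the neural network that detects onsets from the flux.

```python
import numpy as np


def spectral_flux(spec):
    """Sum of positive frame-to-frame magnitude increases.

    spec: non-negative array of shape (n_bins, n_frames).
    Returns an array of length n_frames - 1.
    """
    diff = np.diff(spec, axis=1)
    return np.maximum(diff, 0.0).sum(axis=0)


def greedy_note_search(segment, templates, max_notes=6):
    """Greedily pick note templates whose sum best matches a segment.

    segment:   array of shape (n_bins, n_frames), one inter-onset segment.
    templates: dict mapping a note label to an array of the same shape.
    Assumption: squared error stands in for the paper's similarity measure.
    """
    chosen = []
    approx = np.zeros_like(segment, dtype=float)
    best_err = float(np.sum((segment - approx) ** 2))
    while len(chosen) < max_notes:
        best_name = None
        for name, tmpl in templates.items():
            if name in chosen:
                continue
            err = float(np.sum((segment - (approx + tmpl)) ** 2))
            if err < best_err:
                best_err, best_name = err, name
        if best_name is None:
            break  # no remaining template reduces the error
        chosen.append(best_name)
        approx = approx + templates[best_name]
    return chosen, best_err


def toy_template(bin_index, n_bins=8, n_frames=4):
    """Hypothetical note template: one active bin with a decaying envelope."""
    tmpl = np.zeros((n_bins, n_frames))
    tmpl[bin_index] = 0.5 ** np.arange(n_frames)
    return tmpl


# Toy dictionary and a segment containing two simultaneous notes.
templates = {"C4": toy_template(0), "E4": toy_template(2), "G4": toy_template(4)}
segment = templates["C4"] + templates["G4"]
notes, err = greedy_note_search(segment, templates)
```

Because each template spans the whole segment, including its decay, the search matches note evolution over multiple frames rather than frame by frame. A real dictionary would hold several templates per pitch, one for each sampled dynamic level.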

Zhiyao Duan | Andrea Cogliati
