Paper information - Isolated guitar transcription using a deep belief network

Music transcription involves the transformation of an audio recording to common music notation, colloquially referred to as sheet music. Manually transcribing audio recordings is a difficult and time-consuming process, even for experienced musicians. In response, several algorithms have been proposed to automatically analyze and transcribe the notes sounding in an audio recording; however, these algorithms are often general-purpose, attempting to process any number of instruments producing any number of notes sounding simultaneously. This paper presents a polyphonic transcription algorithm that is constrained to processing the audio output of a single instrument, specifically an acoustic guitar. The transcription system consists of a novel note pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each analysis frame of the input audio signal. Using a compiled dataset of synthesized guitar recordings for evaluation, the algorithm described in this work results in an 11% increase in the f-measure of note transcriptions relative to the transcription algorithm of Zhou et al. (2009). This paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.

Subjects: Data Mining and Machine Learning, Data Science

Gregory Burlet | Abram Hindle

[1] Lei Tang,et al. Large scale multi-label classification via metalabeler , 2009, WWW '09.

[2] J. Beauchamp,et al. Fundamental frequency estimation of musical signals using a two‐way mismatch procedure , 1994 .

[3] Tillman Weyde,et al. Explicit Duration Hidden Markov Models for Multiple-Instrument Polyphonic Music Transcription , 2013, ISMIR.

[4] Tania Lombrozo. Computational Modeling of Chord Fingering for String Instruments , 2005 .

[5] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .

[6] Keith D. Martin,et al. A Blackboard System for Automatic Transcription of Simple Polyphonic Music , 1996 .

[7] Walter D. Potter,et al. An Evolved Neural Network/HC Hybrid for Tablature Creation in GA-based Guitar Arranging , 2006, ICMC.

[8] Gregory D. Burlet. Guitar Tablature Transcription using a Deep Belief Network , 2015 .

[9] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[10] Eric Singer,et al. LEMUR GuitarBot: MIDI Robotic String Instrument , 2003, NIME.

[11] Mark B. Sandler,et al. A tutorial on onset detection in music signals , 2005, IEEE Transactions on Speech and Audio Processing.

[12] Anssi Klapuri,et al. Automatic Transcription of Guitar Chords and Fingering From Audio , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[13] George Tzanetakis,et al. MARSYAS: a framework for audio analysis , 1999, Organised Sound.

[14] Paul E. Utgoff,et al. Many-Layered Learning , 2002, Neural Computation.

[15] Malcolm D. Macleod,et al. The Automated Music Transcription Problem , 2004 .

[16] Juhan Nam,et al. A Classification-Based Polyphonic Piano Transcription Approach Using Learned Feature Representations , 2011, ISMIR.

[17] M.P. Ryynanen,et al. Polyphonic music transcription using note event modeling , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005..

[18] Axel Röbel,et al. Multiple Fundamental Frequency Estimation and Polyphony Inference of Polyphonic Music Signals , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[19] Anssi Klapuri,et al. Automatic Music Transcription: Breaking the Glass Ceiling , 2012, ISMIR.

[20] Jeffrey G. Andrews,et al. Mode Switching for the Multi-Antenna Broadcast Channel Based on Delay and Channel Quantization , 2008, EURASIP J. Adv. Signal Process..

[21] Peter F. Driessen,et al. Path Difference Learning for Guitar Fingering Problem , 2004, ICMC.

[22] Vincenzo Lombardo,et al. Computational Model of Chord Fingering , 2005 .

[23] Nicolas Boulanger-Lewandowski. Modeling High-Dimensional Audio Sequences with Recurrent Neural Networks , 2014 .

[24] Hank Heijink,et al. On the Complexity of Classical Guitar Playing: Functional Adaptations to Task Constraints , 2002, Journal of motor behavior.

[25] Anssi Klapuri,et al. Multiple Fundamental Frequency Estimation by Summing Harmonic Amplitudes , 2006, ISMIR.

[26] Christopher Raphael,et al. Automatic Transcription of Piano Music , 2002, ISMIR.

[27] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[28] Daniel P. W. Ellis,et al. Improving Generalization for Polyphonic Piano Transcription , 2007 .

[29] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[30] Walter D. Potter,et al. A Genetic Algorithm for the Automatic Generation of Playable Guitar Tablature , 2005, ICMC.

[31] Graham E. Poliner,et al. Improving Generalization for Classification-Based Polyphonic Piano Transcription , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[32] Giovanni Costantini,et al. Event based transcription system for polyphonic piano music , 2009, Signal Process..

[33] Geoffrey E. Hinton. Learning multiple layers of representation , 2007, Trends in Cognitive Sciences.

[34] Anssi Klapuri,et al. Automatic music transcription: challenges and future directions , 2013, Journal of Intelligent Information Systems.

[35] Yoshua Bengio,et al. BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 , 2016, ArXiv.

[36] Joshua D. Reiss,et al. A Real-Time Polyphonic Music Transcription System , 2008 .

[37] Gregory Burlet,et al. Robotaba Guitar Tablature Transcription Framework , 2013, ISMIR.

[38] Guillaume Lemaitre,et al. Real-time Polyphonic Music Transcription with Non-negative Matrix Factorization and Beta-divergence , 2010, ISMIR.

[39] Yann LeCun,et al. Moving Beyond Feature Design: Deep Architectures and Automatic Feature Learning in Music Informatics , 2012, ISMIR.

[40] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.

[41] S. Dixon. Onset Detection Revisited , 2006 .

[42] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.

[43] Joshua D. Reiss,et al. A Computationally Efficient Method for Polyphonic Pitch Estimation , 2009, EURASIP J. Adv. Signal Process..

[44] Yann LeCun,et al. Feature learning and deep architectures: new directions for music informatics , 2013, Journal of Intelligent Information Systems.

[45] Juan Pablo Bello,et al. From music audio to chord tablature: Teaching deep convolutional networks to play guitar , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[46] Simon Dixon,et al. An End-to-End Neural Network for Polyphonic Piano Music Transcription , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[47] Matija Marolt,et al. A connectionist approach to automatic transcription of polyphonic piano music , 2004, IEEE Transactions on Multimedia.

[48] P. Smaragdis,et al. Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).

[49] Min-Ling Zhang,et al. A Review on Multi-Label Learning Algorithms , 2014, IEEE Transactions on Knowledge and Data Engineering.

[50] A.P. Klapuri,et al. A perceptually motivated multiple-F0 estimation method , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005..

[51] James Anderson Moorer,et al. On the segmentation and analysis of continuous musical sound by digital computer , 1975 .

[52] Alex Acero,et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .

[53] Anssi Klapuri,et al. Automatic Music Transcription as We Know it Today , 2004 .