Probabilistic model of two-dimensional rhythm tree structure representation for automatic transcription of polyphonic MIDI signals

This paper proposes a Bayesian approach to automatic music transcription of polyphonic MIDI signals based on generative modeling of the onset occurrences of musical notes. Automatic music transcription involves two interdependent subproblems: rhythm recognition and tempo estimation. When we listen to music, we recognize its rhythm and tempo (or beat locations) fairly easily, even though the individual note values and the tempo cannot be determined unambiguously from onset times alone. This ability is presumably supported by empirical knowledge of the rhythm patterns and tempo variations that can occur in music. To automate the recognition of rhythm and tempo, we model the generative process of a MIDI signal of polyphonic music by combining a sub-process that generates a musically natural tempo curve with a sub-process that generates a set of note onset positions based on a two-dimensional rhythm tree structure representation of music, and we develop a parameter inference algorithm for the proposed model. We present transcription results obtained with the proposed method.
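To make the two generative sub-processes concrete, the following is a minimal toy sketch, not the paper's actual model: the tempo curve is approximated by a Gaussian random walk on log-tempo, the rhythm tree is simplified to recursive binary subdivision of a single measure (the paper's representation is two-dimensional, covering multiple voices), and score positions are mapped to physical time by integrating the inverse tempo. All function names and parameter values here are illustrative assumptions.

```python
import math
import random

def generate_tempo_curve(n_beats, tempo0=120.0, sigma=0.02, seed=0):
    # Gaussian random walk on log-tempo (BPM): a crude stand-in for a
    # "musically natural" tempo curve; stays positive and drifts smoothly.
    rng = random.Random(seed)
    log_tempo = math.log(tempo0)
    curve = []
    for _ in range(n_beats):
        log_tempo += rng.gauss(0.0, sigma)
        curve.append(math.exp(log_tempo))
    return curve

def generate_rhythm_tree_onsets(depth=3, p_split=0.6, seed=0):
    # One-dimensional rhythm tree: recursively split a 4-beat measure in
    # half with probability p_split, recording the onset (start position,
    # in beats) of every node reached.
    rng = random.Random(seed)
    onsets = []
    def expand(start, dur, d):
        onsets.append(start)
        if d < depth and rng.random() < p_split:
            expand(start, dur / 2.0, d + 1)
            expand(start + dur / 2.0, dur / 2.0, d + 1)
    expand(0.0, 4.0, 0)
    return sorted(set(onsets))

def score_to_physical_time(onsets_beats, tempo_curve):
    # Map score positions (beats) to seconds by accumulating the beat
    # durations 60/tempo along the (piecewise-constant) tempo curve.
    times = []
    for b in onsets_beats:
        t, beat = 0.0, 0.0
        while beat + 1.0 <= b:
            t += 60.0 / tempo_curve[min(int(beat), len(tempo_curve) - 1)]
            beat += 1.0
        frac = b - beat
        t += frac * 60.0 / tempo_curve[min(int(beat), len(tempo_curve) - 1)]
        times.append(t)
    return times
```

Inverting this direction (from observed onset times back to the rhythm tree and tempo curve) is what the paper's Bayesian inference addresses; the sketch only shows the forward generative direction.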