Polyphonic monotimbral music transcription using dynamic networks

Automatic music transcription, the automatic extraction of the notes played in a digital musical signal, is an open problem. A number of techniques have been applied to it without conclusive results. This work addresses the monotimbral polyphonic version of the problem: a single instrument is played, and more than one note can sound at the same time. The approach is to identify the pattern of a given instrument in the frequency domain. This is achieved using time-delay neural networks that are fed with the band-grouped spectrogram of a polyphonic monotimbral music recording. Because neural networks learn from examples, the system does not need an auditory model to approach this problem. A number of issues remain to be solved before the system is robust and powerful, but promising results using synthesized instruments are presented.
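The input representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the semitone-wide band layout, the frame and hop sizes, and the context width are all assumptions chosen for illustration. The first function sums FFT magnitude bins into bands, and the second stacks each frame with its neighbours, which is the kind of time-delayed input a time-delay neural network would receive.

```python
import numpy as np

def band_grouped_spectrogram(signal, sample_rate, frame_len=2048, hop=512,
                             f_min=55.0, n_bands=48):
    """Magnitude spectrogram with FFT bins summed into semitone-wide bands.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # Band edges lie a quarter tone below and above each semitone centre.
    edges = f_min * 2.0 ** ((np.arange(n_bands + 1) - 0.5) / 12.0)
    bands = np.zeros((n_frames, n_bands))
    for b in range(n_bands):
        mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
        bands[:, b] = spectrum[:, mask].sum(axis=1)
    return bands

def time_delay_windows(bands, context=2):
    """Concatenate each frame with `context` frames on either side.

    Edges are zero-padded, so the output has one row per input frame.
    """
    n_frames, _ = bands.shape
    padded = np.pad(bands, ((context, context), (0, 0)))
    return np.stack([padded[t:t + 2 * context + 1].ravel()
                     for t in range(n_frames)])

if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)       # one second of A4
    bands = band_grouped_spectrogram(tone, sr)
    inputs = time_delay_windows(bands, context=2)
    print(bands.shape, inputs.shape)
```

Each row of `inputs` would then be paired with a target vector marking which notes sound in the centre frame. With these assumed settings the lowest bands are narrower than one FFT bin and may stay empty; a longer frame or a higher `f_min` would avoid that.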
