Composer Recognition Based on 2D-Filtered Piano-Rolls

We propose a method for music classification that applies convolutional models to symbolic pitch–time representations (i.e. piano-rolls), and we apply it to composer recognition. An excerpt of a piece to be classified is first sampled to a 2D pitch–time representation, which is then subjected to various transformations, including convolution with predefined filters (Morlet or Gaussian), and classified by means of support vector machines. We combine classifiers based on different pitch representations (MIDI and morphetic pitch) and on different filter types and configurations. The method does not require parsing the music into separate voices or extracting any other predefined features prior to processing; instead, it is based on the analysis of texture in a 2D pitch–time representation. We show that filtering significantly improves recognition and that the method is robust to encoding, transposition and amount of information. On discriminating between Haydn and Mozart string quartet movements, our best classifier reaches state-of-the-art performance in leave-one-out cross-validation.
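The first two stages described above (sampling an excerpt to a 2D pitch–time grid, then convolving it with a predefined filter) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the note format `(onset, duration, midi_pitch)`, the sampling step, the pitch range and the kernel size are all assumptions chosen for illustration. Only the Gaussian filter and MIDI pitch are shown; the Morlet filter, morphetic pitch and the support vector machine stage are omitted.

```python
import math

def piano_roll(notes, n_steps, step=0.25, low=21, high=108):
    """Sample (onset, duration, midi_pitch) notes to a binary pitch-time grid.

    Rows are pitches low..high, columns are time steps of length `step`
    (all hypothetical defaults, not values taken from the paper).
    """
    roll = [[0.0] * n_steps for _ in range(high - low + 1)]
    for onset, duration, pitch in notes:
        if not low <= pitch <= high:
            continue  # ignore notes outside the sampled pitch range
        start = int(round(onset / step))
        end = max(start + 1, int(round((onset + duration) / step)))
        for t in range(start, min(end, n_steps)):
            roll[pitch - low][t] = 1.0
    return roll

def gaussian_kernel(size, sigma):
    """Square 2D Gaussian kernel of odd `size`, normalised to sum to 1."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
          for x in range(-half, half + 1)]
         for y in range(-half, half + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def filter2d(image, kernel):
    """Same-size 2D convolution with zero padding outside the image."""
    rows, cols = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr = r - (i - kh // 2)  # flipped index: true convolution
                    cc = c - (j - kw // 2)
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out

# Toy excerpt: C4 for one beat, then E4 for half a beat (made-up data).
notes = [(0.0, 1.0, 60), (1.0, 0.5, 64)]
roll = piano_roll(notes, n_steps=8)
kernel = gaussian_kernel(size=5, sigma=1.0)
filtered = filter2d(roll, kernel)

# Flatten the filtered image into a feature vector for a classifier.
features = [v for row in filtered for v in row]
```

In a full pipeline, one such feature vector per excerpt would be passed to a support vector machine; because the filter smears each note over neighbouring pitches and time steps, nearby textures yield similar vectors even when individual notes do not align exactly.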
