Comparing Probabilistic Models for Melodic Sequences

Modelling the real-world complexity of music is a challenge for machine learning. We address the task of modelling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models: a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models for future prediction and compare their performance to a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short-order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM.
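
The short-order evaluation mentioned above can be illustrated with a minimal sketch. It estimates n-gram distributions from test sequences and from model samples, then computes the Kullback-Leibler divergence between them. The n-gram order, the additive smoothing constant `alpha`, the direction of the divergence, and the function names are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product
import math


def ngram_distribution(sequences, n, alphabet, alpha=1.0):
    """Smoothed empirical distribution over all n-grams of `alphabet`.

    `alpha` is an additive (Laplace) smoothing constant, assumed here so
    that every n-gram has non-zero probability and the KL stays finite.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    support = list(product(alphabet, repeat=n))
    total = sum(counts[g] for g in support) + alpha * len(support)
    return {g: (counts[g] + alpha) / total for g in support}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same support."""
    return sum(p[g] * math.log(p[g] / q[g]) for g in p)


# Toy usage with MIDI pitch numbers (hypothetical data).
alphabet = [60, 62, 64]
test_sequences = [[60, 62, 64, 62, 60]]
model_samples = [[60, 60, 60, 60, 60]]

p_test = ngram_distribution(test_sequences, 2, alphabet)
q_model = ngram_distribution(model_samples, 2, alphabet)
divergence = kl_divergence(p_test, q_model)
```

A lower divergence means the model's samples reproduce the short-order statistics of the test melodies more closely. The smoothing choice affects the absolute values, so comparisons between models should use the same `alpha` and n-gram order.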
