An Assessment of Learned Score Features for Modeling Expressive Dynamics in Music

The study of musical expression is an ongoing and increasingly data-intensive endeavor in which machine learning techniques can play an important role. The purpose of this paper is to evaluate the utility of unsupervised feature learning for modeling expressive dynamics, in particular the note intensities of performed music. We use a note-centric representation of musical contexts, which avoids shortcomings of existing musical representations. With that representation, we perform experiments in which learned features are used to predict note intensities. The experiments use a data set comprising professional performances of Chopin's complete piano repertoire. For feature learning we use restricted Boltzmann machines, and we compare the resulting features with features learned using matrix decomposition methods. We evaluate the results both quantitatively and qualitatively, identifying salient learned features and discussing their musical relevance.
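As a rough illustration of the pipeline outlined above (learn features from note contexts without supervision, then predict note intensities from them), the following is a minimal sketch under stated assumptions and not the method used in the paper. The data are synthetic, the binary context vectors are a hypothetical stand-in for the note-centric representation, a truncated SVD (PCA) stands in for the matrix decomposition methods, and the restricted Boltzmann machine is not shown. All function names are illustrative.

```python
import numpy as np

def learn_svd_features(X, n_components):
    """Learn a feature basis from centered data via truncated SVD (PCA)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(X, mean, components):
    """Project context vectors onto the learned basis."""
    return (X - mean) @ components.T

def fit_linear(F, y):
    """Least-squares fit of intensities on features plus a bias term."""
    A = np.column_stack([F, np.ones(len(F))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(F, w):
    """Predict intensities from features with the fitted weights."""
    return np.column_stack([F, np.ones(len(F))]) @ w

rng = np.random.default_rng(0)

# Assumption: 200 notes, each described by a 30-dimensional binary context vector.
X = (rng.random((200, 30)) < 0.2).astype(float)

# Assumption: synthetic "intensities" that depend linearly on the context, plus noise.
y = X @ rng.normal(size=30) + 0.1 * rng.normal(size=200)

# Learn features on the first 150 notes only, then fit the intensity predictor.
mean, comps = learn_svd_features(X[:150], n_components=10)
w = fit_linear(encode(X[:150], mean, comps), y[:150])

# Evaluate on the 50 held-out notes using the Pearson correlation.
y_hat = predict(encode(X[150:], mean, comps), w)
r = np.corrcoef(y_hat, y[150:])[0, 1]
print(f"held-out correlation: {r:.2f}")
```

Swapping `learn_svd_features` and `encode` for another unsupervised learner leaves the regression and evaluation steps unchanged, which is what makes a like-for-like comparison of feature learners possible.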
