A distance model for rhythms

Modeling long-term dependencies in time series has proved very difficult with traditional machine learning methods, and music data is a case in point. In this paper, we introduce a model for rhythms based on the distributions of distances between subsequences. We describe a specific implementation of the model that uses Hamming distances over a simple rhythm representation. The proposed model consistently outperforms a standard Hidden Markov Model in terms of conditional prediction accuracy on two different music databases.
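To make the notion of Hamming distances between subsequences concrete, here is a minimal sketch. It is not the paper's model. It assumes a rhythm is encoded as a binary list (1 for a note onset at a time step, 0 otherwise) and that subsequences are non-overlapping segments of equal length; the abstract specifies neither, and the function names are hypothetical. The sketch only computes the pairwise distances whose distributions a model of this kind would describe.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def split_into_segments(rhythm, seg_len):
    """Cut a rhythm into consecutive non-overlapping segments of seg_len steps.

    Trailing steps that do not fill a whole segment are dropped
    (an assumption made for this illustration).
    """
    n_segments = len(rhythm) // seg_len
    return [rhythm[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def pairwise_distances(rhythm, seg_len):
    """Hamming distance for every pair of segments, keyed by (i, j) with i < j."""
    segments = split_into_segments(rhythm, seg_len)
    return {
        (i, j): hamming(segments[i], segments[j])
        for i, j in combinations(range(len(segments)), 2)
    }


# Toy rhythm (assumed binary onset encoding): three segments of four steps.
rhythm = [1, 0, 0, 1,  1, 0, 0, 1,  1, 1, 0, 0]
distances = pairwise_distances(rhythm, seg_len=4)
print(distances)
```

In this toy rhythm the first two segments are identical and the third differs from each in two positions, so the repetition structure shows up directly in the distances. Collecting such distances over a corpus would give the empirical distributions that a distance-based rhythm model could be fitted to.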
