Multiple Viewpoints Modeling of Tabla Sequences

We describe a system that attempts to predict the continuation of a symbolically encoded tabla composition at each time step using a variable-length n-gram model. Using cross-entropy as a measure of model fit, the best model attained an entropy rate of 0.780 in a cross-validation experiment, showing that symbolic tabla compositions can be effectively encoded using such a model. The choice of smoothing algorithm, which determines how information from models of different orders is combined, is found to be an important factor in the model's performance. We extend the basic n-gram model by adding viewpoints, that is, other streams of information that can be used to improve predictive performance. First, we show that adding a short-term model, built on the current composition rather than the entire corpus, leads to substantial improvements. Additional experiments were conducted with derived types, which are representations computed from the basic data type (stroke names), and cross-types, which model dependencies between parameters such as duration and stroke name. For this database, such extensions improved performance only marginally, although this may have been due to the low entropy rate already attained by the basic model.
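The ingredients named above (a variable-length n-gram model, smoothing across orders, cross-entropy as the fit measure, and a short-term model blended with the corpus-level model) can be illustrated with a minimal Python sketch. It is not the implementation evaluated here: the class `NGramModel`, the fixed interpolation weight `lam`, the equal-weight blend of long-term and short-term predictions, and the toy stroke sequences are all assumptions chosen for brevity, and the smoothing shown is plain recursive interpolation rather than any of the algorithms compared in the experiments.

```python
import math
from collections import Counter, defaultdict


class NGramModel:
    """Illustrative variable-order n-gram model (hypothetical helper).

    Smoothing is a fixed-weight recursive interpolation: the estimate for
    each longer context is blended with the estimate from the next shorter
    one, bottoming out in a uniform distribution over the alphabet.
    """

    def __init__(self, max_order, alphabet, lam=0.7):
        self.max_order = max_order          # longest n-gram length used
        self.alphabet = sorted(alphabet)    # all possible stroke names
        self.lam = lam                      # weight given to the longer context
        self.counts = defaultdict(Counter)  # counts[context][symbol]

    def observe(self, sequence, i):
        """Count sequence[i] under every context of length 0..max_order-1."""
        for k in range(self.max_order):
            if i - k < 0:
                break
            self.counts[tuple(sequence[i - k:i])][sequence[i]] += 1

    def train(self, sequence):
        for i in range(len(sequence)):
            self.observe(sequence, i)

    def prob(self, symbol, history):
        """P(symbol | history), blending orders from shortest to longest."""
        history = tuple(history)
        p = 1.0 / len(self.alphabet)        # uniform base distribution
        for k in range(self.max_order):
            if k > len(history):
                break
            seen = self.counts.get(history[len(history) - k:])
            if not seen:                    # context never observed: stop here
                break
            p = self.lam * seen[symbol] / sum(seen.values()) + (1 - self.lam) * p
        return p


def cross_entropy(model, sequence):
    """Average negative log2 probability per symbol, in bits."""
    bits = -sum(math.log2(model.prob(s, sequence[:i]))
                for i, s in enumerate(sequence))
    return bits / len(sequence)


def cross_entropy_with_short_term(long_term, sequence):
    """Same measure, averaging in a model trained only on the piece so far."""
    short_term = NGramModel(long_term.max_order, long_term.alphabet, long_term.lam)
    bits = 0.0
    for i, s in enumerate(sequence):
        history = sequence[:i]
        p = 0.5 * long_term.prob(s, history) + 0.5 * short_term.prob(s, history)
        bits -= math.log2(p)
        short_term.observe(sequence, i)     # update only after predicting
    return bits / len(sequence)


# Toy data (assumed for illustration; not drawn from the tabla database).
ALPHABET = {"dha", "ge", "na", "tin"}
corpus = [["dha", "ge", "na"] * 4, ["dha", "ge", "na", "dha"] * 3]
long_term = NGramModel(max_order=3, alphabet=ALPHABET)
for composition in corpus:
    long_term.train(composition)

familiar = ["dha", "ge", "na", "dha", "ge", "na"]
novel = ["tin"] * 8                         # stroke unseen in the corpus
print(cross_entropy(long_term, familiar))
print(cross_entropy(long_term, novel))
print(cross_entropy_with_short_term(long_term, novel))
```

On the toy data, the sequence built from a stroke absent from the corpus is expensive to encode under the corpus-level model alone, and the short-term model lowers that cost because it adapts to repetition within the piece. A viewpoint system would go further by maintaining several such models over derived types and cross-types and combining their predictions.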
