Streaming from MIDI Using Constraint Satisfaction Optimization and Sequence Alignment

We present a new system for separating streams in musical pieces encoded as MIDI files. Our approach (1) divides the music under analysis into short segments, (2) analyzes each segment using constraint satisfaction optimization, and (3) connects these analyses using a sequence alignment algorithm. Parameters for the system are learned automatically from a small training corpus and generalize reasonably well across a variety of pieces. We report performance results, measured by the F-measure, on 108 pieces of Baroque, Classical, and Romantic music: J.S. Bach’s two-part inventions (0.95), three-part sinfonias (0.92), and fugues from the Well-Tempered Clavier, Book I (0.93) and Book II (0.92); and string quartets by Haydn (0.81) and Brahms (0.76).
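To make the three-stage structure concrete, the following is a minimal sketch, not the system described above. It assumes notes are (onset, pitch) pairs and that the number of streams is known in advance. An exhaustive search over labelings stands in for the constraint satisfaction optimization, and a brute-force match over stream permutations stands in for the sequence alignment step. All function names, the pitch-leap cost, and the fixed window length are hypothetical choices made for illustration.

```python
from itertools import permutations, product


def split_into_segments(notes, length):
    """Stage 1: group (onset, pitch) notes into fixed-length time windows."""
    windows = {}
    for onset, pitch in sorted(notes):
        windows.setdefault(int(onset // length), []).append((onset, pitch))
    return [windows[k] for k in sorted(windows)]


def labeling_cost(segment, labels):
    """Assumed cost: total pitch leap inside each stream.

    Hard constraint (also an assumption): two notes with the same onset
    may not share a stream, so such labelings get infinite cost.
    """
    last = {}  # stream -> (onset, pitch) of its most recent note
    cost = 0
    for (onset, pitch), stream in zip(segment, labels):
        if stream in last:
            prev_onset, prev_pitch = last[stream]
            if prev_onset == onset:
                return float("inf")
            cost += abs(pitch - prev_pitch)
        last[stream] = (onset, pitch)
    return cost


def analyze_segment(segment, n_streams):
    """Stage 2 stand-in: exhaustive search for the cheapest labeling."""
    candidates = product(range(n_streams), repeat=len(segment))
    return min(candidates, key=lambda labels: labeling_cost(segment, labels))


def separate_streams(notes, n_streams, length):
    """Stage 3 stand-in: attach each segment's local streams to the global
    streams using the permutation with the smallest pitch gap at the seam."""
    streams = [[] for _ in range(n_streams)]
    for segment in split_into_segments(notes, length):
        labels = analyze_segment(segment, n_streams)
        local = [[] for _ in range(n_streams)]
        for note, stream in zip(segment, labels):
            local[stream].append(note)

        def link_cost(perm):
            # perm[i] is the global stream that local stream i attaches to.
            return sum(
                abs(local[i][0][1] - streams[g][-1][1])
                for i, g in enumerate(perm)
                if local[i] and streams[g]
            )

        best = min(permutations(range(n_streams)), key=link_cost)
        for i, g in enumerate(best):
            streams[g].extend(local[i])
    return streams


# Two parallel voices, one note per beat in each voice.
upper = [(t, p) for t, p in enumerate([72, 74, 76, 77])]
lower = [(t, p) for t, p in enumerate([48, 50, 52, 53])]
streams = separate_streams(upper + lower, n_streams=2, length=2)
```

The exhaustive search grows exponentially with the number of notes in a segment, which is one reason short segments matter in a design like this. A real system would replace both stand-ins with the optimization and alignment methods named in the abstract.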
