An expressive and compact representation of musical sound

We describe a representation of musical instrument sounds in which the sound is encoded as the time histories of the parameters controlling a nonlinear physical model of the target instrument. In contrast to the source-filter models often employed in speech, in which an assumed periodic source is filtered by a time-varying acoustic transfer function, our method directly simulates the nonlinear dynamical behavior of the source generator in interaction with an acoustic filter. The resulting autonomous oscillations of the combined system are controlled by a small number of slowly varying parameters. This representation has two advantages: first, it is extremely compact, requiring only tens of bytes per second of data, and second, it naturally captures variations of musical timbre and other nuances of live musical performance. We present preliminary results for recorded clarinet sounds in which maximum likelihood estimation (MLE) is used to infer the control-parameter histories for an assumed target clarinet model.
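To make the idea concrete, the sketch below shows how a compact track of slowly varying control parameters can drive a self-oscillating nonlinear model to reconstruct audio. It is not the paper's model or its MLE estimator; it is a minimal digital-waveguide-style clarinet (in the spirit of Smith's waveguide synthesis) with illustrative, untuned constants, and all names (reed_reflection, synthesize, BORE_DELAY, the control rate of 100 Hz) are hypothetical.

```python
# Minimal sketch (not the paper's model): a waveguide-style clarinet whose
# audio is reconstructed from slowly varying control-parameter histories
# (blowing pressure and a reed-table slope).  The parameter track is the
# compact representation; the simulation regenerates the full waveform.
import numpy as np

FS = 44100            # audio sample rate (Hz)
BORE_DELAY = 100      # one-way bore delay in samples (sets the pitch)

def reed_reflection(p_delta, slope):
    """Nonlinear reed reflection coefficient vs. pressure difference
    across the reed (piecewise-linear table, clipped to [-1, 1])."""
    return np.clip(0.7 - slope * p_delta, -1.0, 1.0)

def synthesize(pressure_track, slope_track):
    """Render audio from control parameters sampled at ~100 Hz
    (upsampled here by zero-order hold to the audio rate)."""
    hold = FS // 100
    p_m = np.repeat(pressure_track, hold)   # blowing pressure per sample
    k   = np.repeat(slope_track, hold)      # reed-table slope per sample

    upper = np.zeros(BORE_DELAY)   # wave traveling from reed toward bell
    lower = np.zeros(BORE_DELAY)   # wave traveling back toward the reed
    lp_state = 0.0                 # one-pole lowpass at the open (bell) end
    out = np.zeros(len(p_m))

    for n in range(len(p_m)):
        # bell reflection: sign inversion plus losses (one-pole lowpass)
        lp_state = 0.6 * upper[-1] + 0.4 * lp_state
        bell_reflected = -lp_state

        # reed junction: scatter the returning wave against mouth pressure
        p_minus = lower[-1]
        p_delta = 0.5 * p_m[n] - p_minus
        rho = reed_reflection(p_delta, k[n])
        p_plus = 0.5 * p_m[n] + rho * p_delta   # wave sent back into the bore

        # advance both delay lines by one sample
        upper = np.concatenate(([p_plus], upper[:-1]))
        lower = np.concatenate(([bell_reflected], lower[:-1]))

        out[n] = upper[0] + lower[0]            # bore pressure near the reed
    return out

# One second of sound described by ~200 control values (a few hundred bytes)
# instead of 44100 audio samples: a crescendo with a fixed embouchure.
pressure = np.linspace(0.3, 1.0, 100)
slope = np.full(100, 0.3)
audio = synthesize(pressure, slope)
```

In the paper's framework the corresponding parameter histories would be inferred from a recorded clarinet sound by maximum likelihood estimation rather than specified by hand as above.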