Rule-Based Expressive Modifications of Tempo in Polyphonic Audio Recordings

This paper describes selected aspects of a system for expressive, rule-based modification of tempo, dynamics and articulation in audio recordings. The input audio signal is first aligned with a score that carries extra information on how to modify the performance. The signal is then transformed into the time-frequency domain, where each played tone is identified using partial tracking together with the score information. Articulation and dynamics are changed by modifying the length and content of the partial tracks. The focus here is on the tempo modification, which is done using a combination of time-frequency techniques and phase reconstruction. Preliminary results indicate that the inter-onset intervals (IOIs) in the resulting signal deviate from the desired ones by 8.2 ms on average. Possible applications of such a system include music pedagogy, basic perception research and interactive music systems.
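
The accuracy figure above compares inter-onset intervals in the modified signal with the desired ones. The abstract does not state the exact error measure, so the following is only a minimal sketch that assumes a mean absolute IOI difference; the function names and the onset values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def inter_onset_intervals(onsets_s: Sequence[float]) -> List[float]:
    """Return the differences between consecutive onset times (seconds)."""
    return [later - earlier for earlier, later in zip(onsets_s, onsets_s[1:])]


def mean_abs_ioi_error_ms(desired_onsets_s: Sequence[float],
                          measured_onsets_s: Sequence[float]) -> float:
    """Mean absolute IOI deviation in milliseconds.

    Assumption: both sequences list the same tones in the same order,
    with onset times in seconds.
    """
    if len(desired_onsets_s) != len(measured_onsets_s):
        raise ValueError("onset lists must have the same length")
    if len(desired_onsets_s) < 2:
        raise ValueError("at least two onsets are needed to form an IOI")

    desired = inter_onset_intervals(desired_onsets_s)
    measured = inter_onset_intervals(measured_onsets_s)
    errors = [abs(m - d) for m, d in zip(measured, desired)]
    return 1000.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical onset times (seconds), for illustration only.
    desired = [0.0, 0.5, 1.0, 1.5]
    measured = [0.0, 0.51, 1.0, 1.49]
    print(f"mean |IOI error| = {mean_abs_ioi_error_ms(desired, measured):.1f} ms")
```

Working on IOIs rather than on absolute onset times means that a constant offset between the two signals does not count as an error; only deviations in the timing between successive tones do.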
