Timbre models of musical sounds - from the model of one sound to the model of one instrument

This work covers the analysis of musical instrument sounds, the creation of timbre models, the estimation of the timbre model parameters, and the analysis of those parameters. The timbre models are derived from the literature on auditory perception and from the study of the gestures of music performance. Important results of this work include an improved fundamental frequency estimator, a new envelope analysis method, and simple, intuitive models of the sound of musical instruments. Furthermore, a model of the spectral envelope is introduced, and with it a new function, the brightness creation function. The timbre model is used to analyze how the different timbre parameters evolve when the fundamental frequency changes, and also across different intensities, tempi, and styles. The main results of this analysis are that brightness rises with frequency, although the fundamental nevertheless carries almost all of the amplitude for the high notes. The attack and release times generally fall with frequency. Only brightness and amplitude are affected by a change in intensity, and only the sustain and release times are affected by a change in tempo. The different timbre models are also used to classify sounds into musical instrument classes, with very good results. Finally, listening tests were performed, which established that the best timbre model has an acceptable sound quality.

Resumé (Danish summary)

This work deals with the analysis of musical instruments, the creation of models of the timbre of musical instruments, the estimation of the timbre model parameters, and the analysis of the model parameters. The timbre models were found by reviewing the literature on sound perception and by studying music performance. Some important results of this work are an improved fundamental frequency estimator, a new envelope analysis method, and simple, intuitive models of musical sound. In addition, a model of the spectral envelope has been developed, and in that connection a new function for synthesizing sound with a given brightness. The timbre model is used to analyze the evolution of the different timbre attributes when the fundamental frequency changes, and also for different intensities, tempi, and styles. The most important conclusions of this work are that brightness rises with frequency, but that the fundamental nevertheless has almost all of the amplitude for the high notes. The attack and release times fall with frequency. For changes in intensity and tempo, it was found that only brightness and amplitude change when the intensity changes, and that only the sustain and release times change when the tempo changes. The different timbre models are also used to classify sounds into instrument classes, with very good results. Listening tests established that the best timbre model has an acceptable sound quality.

Résumé (French summary)

This work deals with the analysis of musical sounds, the creation of timbre models, the estimation of the timbre model parameters, and the analysis of the model parameters. The timbre models were found in the literature on auditory perception and by studying the gestures of the musician. Among the important results of the work presented here is an improved estimation of the fundamental frequency. A new method for estimating the attack and release times has been developed, along with intuitive models of musical instrument sounds. A new spectral envelope model has been defined, together with a function that produces a sound with a specified brightness. The timbre models are used to analyze the evolution of the timbre parameters as a function of fundamental frequency, intensity, tempo, or style. The main result of this analysis is that brightness rises with frequency, but that the fundamental has almost all of the amplitude in the high register. The attack and release times decrease with the fundamental frequency. When the intensity varies, only the amplitude and the brightness are affected. Only the sustain and release times change with tempo. The timbre model is also used to classify sounds into instrument classes, with very good results. Finally, listening tests of all the models led to the conclusion that the best timbre model has an acceptable sound quality.

Acknowledgments

First and foremost, my thanks go to Jens Arnspang, who created the music informatics group at the Computer Science Department of the University of Copenhagen, and without whom this work would never have started. Jens agreed to be my supervisor and had the open mind to let me pursue my own directions, and detours. Secondly, my thanks go to the two members of the monitor group, Ivar Frounberg and Holger Rindel, for insightful comments and feedback in both the musical and the technical domain. Their comments helped keep my work focused and inspired further improvements. This work was financed by the Danish Technical Research Council, which I thank.

My thanks go to all members, past and present, of the music informatics group. Special thanks go to Klaus Hansen for invaluable help. Fruitful discussions with Stefan Borum, Anders Møller and Esben Skovenberg have also been a great source of inspiration. Sincere thanks go to the musicians who agreed to spend time recording sounds that are not music. The musical instrument sound database created with their help has been instrumental in this work. The judgments and comments from the participants in the listening tests have also been a great help, and my sincere thanks go to all of them.

Many helpful comments have also come from other groups in the computer science department. Special thanks to Ketil Perstrup, Kristian Pilgaard, Jon Sporring, Joachim Weickert, Peter Riber, Stig Skelboe, Knud Henriksen, Erik Frøkjær and Morten Hanehøj. The image group, and notably the scale-space community, have been a great source of inspiration. My thoughts go to everybody at DIKU who has made my stay here so pleasant.

Part of the Ph.D. work in Denmark is spent at a different research institution, as required by the Danish Ph.D. circular. I was very lucky to be accepted at the Groupe Informatique Musical at the Laboratoire Mecanique et Acoustique in Marseille, France. My sincere thanks go to Jean-Claude Risset for having accepted me in his group, and to Richard Kronland-Martinet and Philippe Guillemain for help and discussions. My stay at the Groupe Informatique Musical was made agreeable by the fruitful discussions and the nice atmosphere in the group.

A final thanks goes to Carol Jensen, Thomas Jensen and all the members of my family, and especially to Alice, who hopefully will see more of Papa soon.
