Metrics and Morphing of Power Spectra

Spectral analysis of time series has long been an important tool of science. Indeed, the periodicities of celestial events and of weather phenomena sparked the curiosity and imagination of early thinkers in the history of science. More recently, the refined mathematical techniques for spectral analysis developed over the past fifty years have formed the basis for a wide range of technological developments, from medical imaging to communications. However, despite the centrality of spectral representations of time series, there is no universal agreement on what constitutes a suitable metric between such representations. In this paper we discuss three alternative metrics along with their application to morphing speech signals. Morphing can be effected naturally by deforming power spectra along geodesics of the corresponding geometry. The acoustic effect of morphing between two speakers is documented at a website.
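To make the idea of deforming one power spectrum into another along a path concrete, the following is a minimal sketch under stated assumptions. The abstract does not specify the three metrics, so the sketch assumes one simple geometry: the L2 distance between log-spectra, whose geodesics are straight lines in the log domain (pointwise geometric interpolation). The function names `ar1_spectrum` and `geodesic_log_interp` and the first-order autoregressive test spectra are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import math


def ar1_spectrum(a, n_bins=8):
    """Sampled power spectrum of x[k] = a*x[k-1] + w[k] with unit-variance
    white noise w, evaluated on n_bins frequencies in [0, pi].
    (Illustrative test signal; an assumption, not from the paper.)"""
    thetas = [math.pi * i / (n_bins - 1) for i in range(n_bins)]
    return [1.0 / (1.0 - 2.0 * a * math.cos(th) + a * a) for th in thetas]


def geodesic_log_interp(f0, f1, t):
    """Point at parameter t in [0, 1] on the straight line between
    log(f0) and log(f1), mapped back to the spectral domain.
    This is the geodesic of the assumed log-spectral L2 geometry."""
    if len(f0) != len(f1):
        raise ValueError("spectra must be sampled on the same frequency grid")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if min(f0) <= 0.0 or min(f1) <= 0.0:
        raise ValueError("power spectra must be strictly positive")
    return [
        math.exp((1.0 - t) * math.log(a) + t * math.log(b))
        for a, b in zip(f0, f1)
    ]


# Morph a low-pass spectrum (a = 0.5) into a high-pass one (a = -0.5).
f_start = ar1_spectrum(0.5)
f_end = ar1_spectrum(-0.5)
path = [geodesic_log_interp(f_start, f_end, k / 4) for k in range(5)]
```

Each element of `path` is an intermediate spectrum. A different choice of metric would give a different family of intermediate spectra between the same two endpoints.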
