Perceptual Importance of the Phase Related Information in Speech

This paper studies the importance of phase information for the perceptual quality of speech signals. Many speech synthesisers discard the original phase information of the signal, assuming that its contribution is almost inaudible. The Relative Phase Shift (RPS) representation of the phase allows straightforward analysis, manipulation and resynthesis of the phase structure, and we use these features to carry out a comparative evaluation of several phase modifications commonly found in speech models. The ultimate aim of this study is to answer the question of whether phases deserve elaborate models in order to obtain high-quality synthetic speech, or whether their effects are subtle enough to justify overlooking them.
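
As a rough illustration of the RPS idea, the sketch below assumes the definition commonly given in the harmonic-modelling literature, in which the RPS of harmonic k is its instantaneous phase minus k times the phase of the fundamental. This formula is not stated in the abstract itself, and the function names (`wrap_phase`, `relative_phase_shift`, `shift_in_time`) are hypothetical helpers chosen for this example only. The sketch also shows why such a representation is convenient for analysis: under this assumed definition, moving the analysis instant leaves the RPS values unchanged.

```python
import numpy as np


def wrap_phase(phi):
    """Wrap angles in radians to the interval [-pi, pi)."""
    return (np.asarray(phi, dtype=float) + np.pi) % (2.0 * np.pi) - np.pi


def relative_phase_shift(harmonic_phases):
    """Return the RPS of each harmonic (ASSUMED definition, not from the abstract).

    harmonic_phases[k-1] is the instantaneous phase of harmonic k at one
    analysis instant. The assumed RPS is phi_k - k * phi_1, wrapped.
    """
    phases = np.asarray(harmonic_phases, dtype=float)
    k = np.arange(1, phases.size + 1)  # harmonic indices 1..K
    return wrap_phase(phases - k * phases[0])


def shift_in_time(harmonic_phases, fundamental_advance):
    """Move the analysis instant so the fundamental advances by the given angle.

    For a stationary harmonic signal, harmonic k then advances by
    k * fundamental_advance.
    """
    phases = np.asarray(harmonic_phases, dtype=float)
    k = np.arange(1, phases.size + 1)
    return wrap_phase(phases + k * fundamental_advance)


if __name__ == "__main__":
    phases = np.array([0.3, 1.2, -2.0, 0.5])  # illustrative values only
    rps_a = relative_phase_shift(phases)
    rps_b = relative_phase_shift(shift_in_time(phases, 0.7))
    # The two RPS vectors agree up to multiples of 2*pi.
    print(np.allclose(wrap_phase(rps_a - rps_b), 0.0, atol=1e-9))
```

The comparison is made on the wrapped difference rather than on the raw values, so that two angles lying on either side of the wrapping boundary are still treated as equal.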
