Phase Effects on Speech and Its Influence on Warped Speech

A method of transforming speech from one speaker's voice to another is discussed. The method moves speech magnitude information from a source speaker to a target speaker by dynamic warping in both the time domain and the frequency domain. Because this process involves only spectral magnitudes, it has been found to introduce significant deleterious signal processing artifacts. Reconstructing the phase information has been found to significantly improve the quality of the transformed speech.
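As a rough illustration of the warping step, the sketch below implements classic dynamic time warping between two one-dimensional sequences. It is not the two-level method itself, whose details are not given here. The function name dtw_path, the absolute-difference distance, and the toy sequences are assumptions made for illustration; in practice the sequences would be frames of spectral magnitudes compared with a suitable frame distance.

```python
from math import inf


def dtw_path(source, target, dist=lambda a, b: abs(a - b)):
    """Classic dynamic time warping (hypothetical helper, for illustration).

    Returns (total_cost, path), where path is a list of (i, j) pairs
    aligning source[i] with target[j].
    """
    n, m = len(source), len(target)
    # cost[i][j]: minimal accumulated cost of aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # source advances, target element repeats
                cost[i][j - 1],      # target advances, source element repeats
                cost[i - 1][j - 1],  # both advance
            )

    # Backtrack from the end to recover the alignment path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy example: the target holds the middle value for one extra step.
total_cost, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(total_cost, path)
```

The returned path states which source element maps to which target element. Applying the same idea along the frequency axis of each aligned frame is one plausible reading of warping in both domains, though that reading is an assumption here.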
