Periodicity extraction for voiced sounds with multiple periodicity

This article introduces a periodicity extraction method for analyzing voiced sounds with complex excitation behavior. Although ordinary voiced sounds have a single periodicity, some voiced sounds, such as pathological and singing voices, often have multiple periodicities. To handle such voices, a method for estimating multiple periodicities from voiced sounds is proposed. First, multiple periodicity is defined and its causes are explained; then the principle of the proposed method is introduced. The method was evaluated using several artificial signals and pathological voices recorded in a real environment. The analysis of the artificial signals indicated that the proposed method can extract multiple periodicities, and the analysis of the pathological voices showed a similar tendency. These results suggest that the proposed method is effective at extracting multiple periodicities.
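As a rough illustration of what an artificial signal with more than one periodicity might look like, the sketch below builds an impulse train in which every second pulse is attenuated and then lists the autocorrelation peaks. This is not the method proposed in the article: the test signal, the plain autocorrelation analysis, and all names and parameter values (`make_alternating_pulse_train`, `weak_gain`, `threshold`, and so on) are assumptions chosen only for illustration.

```python
def make_alternating_pulse_train(n_samples=2000, period=100, weak_gain=0.6):
    """Impulse train whose every second pulse is attenuated (hypothetical test signal)."""
    signal = [0.0] * n_samples
    for k, start in enumerate(range(0, n_samples, period)):
        signal[start] = 1.0 if k % 2 == 0 else weak_gain
    return signal


def normalized_autocorrelation(signal, max_lag):
    """Biased autocorrelation for lags 0..max_lag, divided by its zero-lag value."""
    n = len(signal)
    energy = sum(v * v for v in signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def candidate_periods(acf, threshold=0.5):
    """Lags (in samples) where the autocorrelation has a local peak above threshold."""
    return [
        lag
        for lag in range(1, len(acf) - 1)
        if acf[lag] > threshold
        and acf[lag] > acf[lag - 1]
        and acf[lag] >= acf[lag + 1]
    ]


signal = make_alternating_pulse_train()
acf = normalized_autocorrelation(signal, max_lag=250)
periods = candidate_periods(acf)
print(periods)  # candidate period lengths in samples
```

With these assumed settings, the pulse spacing (100 samples) and the full repetition interval of the strong/weak pattern (200 samples) both appear as candidates, and the longer lag has the higher autocorrelation value. A single-period estimator would have to pick one of them, which is the kind of ambiguity a multiple-periodicity analysis is meant to expose.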
