Paper information - Voicesauce: A Program for Voice Analysis

Voicesauce: A Program for Voice Analysis

VOICESAUCE is a new application, implemented in MATLAB, which provides automated voice measurements over time from audio recordings. The measures currently computed are F0, H1(*), H2(*), H4(*), H1(*)‐H2(*), H2(*)‐H4(*), H1(*)‐A1, H1(*)‐A2, H1(*)‐A3, energy, Cepstral Peak Prominence, F1–F4, and B1–B4, where (*) indicates that harmonic amplitudes are reported with and without corrections for formant frequencies and bandwidths [Iseli et al. (2006)]. Formant values are calculated using the Snack Sound Toolkit, while F0 is calculated using the STRAIGHT algorithm; harmonic spectra magnitudes are computed pitch‐synchronously. VOICESAUCE takes as input a folder of wav files, and for each input wav file produces a MATLAB file with values every millisecond for all measures. It can operate over the whole input file or over segments delimited by a PRAAT textgrid file. VOICESAUCE then takes these MATLAB outputs, optionally along with electroglottographic measurements obtained separately from PCQUIRERX, and provides con...
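To make the uncorrected H1‐H2 measure concrete, the following is a minimal Python sketch that estimates the amplitudes of the first two harmonics of a synthetic signal and reports their difference in dB. It is an illustration only and not VOICESAUCE's implementation: VOICESAUCE is written in MATLAB, works pitch‐synchronously with F0 from STRAIGHT, and also reports corrected (*) values, none of which is reproduced here. The function names `harmonic_amplitude_db` and `h1_minus_h2`, the synthetic test signal, and the assumption that F0 is already known are all hypothetical choices made for this example.

```python
import math


def harmonic_amplitude_db(samples, sample_rate, freq_hz):
    """Amplitude in dB of the component of `samples` at `freq_hz`.

    Projects the signal onto a cosine and a sine at the target frequency
    (a single-bin DFT). Hypothetical helper, not part of VOICESAUCE.
    """
    re = 0.0
    im = 0.0
    for n, x in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate
        re += x * math.cos(phase)
        im -= x * math.sin(phase)
    # Scale so that a pure sinusoid of amplitude A gives approximately A
    # when the window holds a whole number of periods.
    amplitude = 2.0 * math.hypot(re, im) / len(samples)
    return 20.0 * math.log10(max(amplitude, 1e-12))


def h1_minus_h2(samples, sample_rate, f0_hz):
    """Uncorrected H1-H2: first-harmonic minus second-harmonic level (dB)."""
    h1 = harmonic_amplitude_db(samples, sample_rate, f0_hz)
    h2 = harmonic_amplitude_db(samples, sample_rate, 2.0 * f0_hz)
    return h1 - h2


# Synthetic frame (assumed values): 50 ms at 16 kHz with F0 = 200 Hz,
# so the window spans exactly 10 periods of the fundamental.
sample_rate = 16000
f0 = 200.0
samples = [
    1.0 * math.sin(2.0 * math.pi * f0 * n / sample_rate)
    + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * n / sample_rate)
    for n in range(800)
]

diff_db = h1_minus_h2(samples, sample_rate, f0)
print(f"H1-H2 = {diff_db:.2f} dB")  # about 6.02 dB: H1 has twice the amplitude of H2
```

On real speech, F0 varies over time and the spectrum is shaped by the vocal tract, which is why the abstract describes pitch‐synchronous analysis and formant‐based corrections; this sketch leaves both out.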

[1] Abeer Alwan, et al. Inter- and intra-speaker variability of glottal flow derivative using the LF model, 2000, INTERSPEECH.

[2] Guus de Krom, et al. A Cepstrum-Based Technique for Determining a Harmonics-to-Noise Ratio in Speech Signals, 1993.

[3] Chao-Yang Lee, et al. Identifying isolated, multispeaker Mandarin tones from brief acoustic input: a perceptual and acoustic study, 2009, The Journal of the Acoustical Society of America.

[4] Abeer Alwan, et al. An improved correction formula for the estimation of harmonic magnitudes and its application to open quotient estimation, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5] J. Hillenbrand, et al. Acoustic correlates of breathy vocal quality, 1994, Journal of speech and hearing research.

[6] Abeer Alwan, et al. Age- and Gender-Dependent Analysis of Voice Source Characteristics, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[7] P. Boersma. Praat: doing phonetics by computer (version 5.1.05), 2009.

[8] Roy D. Patterson, et al. An instantaneous-frequency-based pitch extraction method for high-quality speech transformation: revised TEMPO in the STRAIGHT-suite, 1998, ICSLP.

[9] Abeer Alwan, et al. Age, sex, and vowel dependencies of acoustic measures related to the voice source, 2007, The Journal of the Acoustical Society of America.

[10] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.

[11] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.

[12] Xuejing Sun, et al. Pitch determination and voice quality analysis using Subharmonic-to-Harmonic Ratio, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[13] J W Hawks, et al. A formant bandwidth estimation procedure for vowel synthesis [43.72.Ja], 1995, The Journal of the Acoustical Society of America.

[14] Jody Kreiman, et al. Toward a taxonomy of nonmodal phonation, 2001, J. Phonetics.

[15] H M Hanson, et al. Glottal characteristics of female speakers: acoustic correlates, 1997, The Journal of the Acoustical Society of America.

[16] G. de Krom. A cepstrum-based technique for determining a harmonics-to-noise ratio in speech signals, 1993, Journal of speech and hearing research.

[17] Christina M. Esposito. Variation in contrastive phonation in Santa Ana Del Valle Zapotec, 2010, Journal of the International Phonetic Association.

[18] Chad Vicenik, et al. An acoustic study of Georgian stop consonants, 2010, Journal of the International Phonetic Association.

[19] Jonathan Harrington, et al. Phonetic Analysis of Speech Corpora, 2010.