Basic parameters in speech processing. The need for evaluation

We regard pitch, duration, intensity, voice quality, signal-to-noise ratio, voice activity detection, and the strength of the Lombard effect as the basic parameters in speech processing. Many algorithms have been published for extracting these parameters automatically from the speech signal, but for many of them the performance is not known, particularly once adverse conditions are taken into account. We propose a framework based on competitive evaluation to drive algorithmic research and to make progress comparable.
