Automatic estimation of speaking rate in multilingual spontaneous speech

This paper presents an automatic method for estimating speaking rate. It is based on an unsupervised vowel detection algorithm and can therefore be applied to any language at no additional cost. Validation is performed on a spontaneous speech subset of the OGI Multilingual Telephone Speech Corpus. The correlation coefficient between the estimated and reference speaking rates (both expressed in vowels per second) averages 0.84 across the 6 languages for which a phonetic transcription is available (English, German, Hindi, Japanese, Mandarin and Spanish).
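The evaluation measure described above can be illustrated with a minimal Python sketch. It assumes that the speaking rate of an utterance is the number of vowels divided by the utterance duration, and that the correlation coefficient is the Pearson coefficient; neither detail is stated explicitly here. The function names and the toy values are hypothetical and are not taken from the corpus or from the paper's results.

```python
import math


def vowel_rate(num_vowels, duration_s):
    """Speaking rate in vowels per second (assumed definition)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return num_vowels / duration_s


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products and centered squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy data, invented for illustration: (detected vowels, reference vowels, duration in s).
utterances = [(40, 44, 10.0), (55, 52, 11.0), (30, 36, 9.0), (62, 60, 12.0)]

estimated = [vowel_rate(det, dur) for det, _, dur in utterances]
reference = [vowel_rate(ref, dur) for _, ref, dur in utterances]

print(f"correlation on toy data: {pearson(estimated, reference):.2f}")
```

In this sketch the correlation is computed over the utterances of a single language; averaging the per-language coefficients would then give a single figure comparable to the one reported above.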