Para-Linguistic Information Represented as Distortion of the Acoustic Universal Structure In Speech

Speech acoustics vary from speaker to speaker, from microphone to microphone, and so on. Recently, a novel method was proposed to separate these static non-linguistic features from speech, just as spectral smoothing can separate pitch information from speech (N. Minematsu, 2005). Absolute properties of speech events, such as formants and spectra, are completely discarded, and only the phonic differences or contrasts between the events are extracted to form their external structure. This structure is called the acoustic universal structure and is regarded as a physical implementation of structural phonology, because it is considered to represent only the linguistic and para-linguistic information. This paper focuses on the size of the structure and examines its correlation with the para-linguistic information. Results showed that the size can be interpreted as the magnitude of the articulatory effort made in speech production.
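The idea of keeping only the contrasts between speech events can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: each speech event is a single point in a toy 2-D feature space, contrasts are plain Euclidean distances (the distance measure used in the paper may well differ), and `contrast_matrix` and `structure_size` are hypothetical helper names, with the size defined here as the root mean square of all pairwise distances.

```python
import math
from itertools import combinations

def contrast_matrix(events):
    """Pairwise Euclidean distances between events (assumed feature vectors)."""
    return [[math.dist(a, b) for b in events] for a in events]

def structure_size(events):
    """Hypothetical size measure: RMS of all pairwise distances."""
    dists = [math.dist(a, b) for a, b in combinations(events, 2)]
    return math.sqrt(sum(d * d for d in dists) / len(dists))

# Toy event centroids; the values are illustrative, not measured data.
vowels = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

# A static bias (e.g. a different microphone, modeled here as a constant
# offset) moves every event but leaves the contrasts untouched.
shifted = [(x + 10.0, y - 5.0) for x, y in vowels]

# Weaker articulation is modeled here as events drawn closer together.
reduced = [(0.5 * x, 0.5 * y) for x, y in vowels]

print(structure_size(vowels))
print(structure_size(shifted))
print(structure_size(reduced))
```

In this toy setting the shifted events give the same size as the original ones, while the reduced events give half of it, which matches the reading of structure size as a correlate of articulatory effort.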

[1] Hermann Ney, et al. Vocal tract normalization equals linear transformation in cepstral space, 2001, IEEE Transactions on Speech and Audio Processing.

[2] Roman Jakobson, et al. Notes on the French Phonemic Pattern, 1949.

[3] Keikichi Hirose, et al. Japanese vowel recognition based on structural representation of speech, 2005, INTERSPEECH.

[4] K. Hirose, et al. Japanese vowel recognition using external structure of speech, 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.

[5] Nobuaki Minematsu. Pronunciation assessment based upon the phonological distortions observed in language learners' utterances, 2004, INTERSPEECH.

[6] Nobuaki Minematsu. Mathematical evidence of the acoustic universal structure in speech, 2005, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.