Experimental Study of Using Spectrum-based Features for Structural Representation of Speech

Non-linguistic factors such as morphological differences in vocal tracts inevitably affect the acoustic features of speech. Recently, a new speech representation, called structural representation, was proposed that is completely independent of these factors. In this representation, the absolute properties of speech events are discarded entirely, and only their relative properties are captured and modeled. In previous studies, all discussions of this new representation were based on cepstrum-based features. In this report, spectrum-based features are used for the structural representation and tested for speech recognition. Mathematical and experimental discussions show the following. 1) The spectrum-based structural representation also has strong speaker invariance. 2) It can achieve better performance than cepstrum-based structures in noisy speech recognition. 3) It performs rather similarly to humans when tested on noise-vocoded speech samples. Finally, we discuss the validity of spectrum-based structural speech recognition as a model of human speech perception.