Visual characterization of speech spectrograms

This paper describes a system that applies computer vision techniques to extract acoustic patterns from speech spectrograms. By passing a spectrographic image through a set of edge detectors and combining their outputs, the system obtains two-dimensional objects that characterize the formant patterns and general spectral properties of vowels and consonants. To validate the approach, a limited vowel recognition experiment was performed on the resulting "object" spectrograms. Preliminary results indicate that this processing retains the acoustic information needed to identify the underlying phonetic representation.
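
As an illustration only, the edge-detection step described above can be sketched in a few lines of Python. The abstract does not specify the detectors, the combination rule, or how objects are delimited, so everything here is an assumption: the two Sobel kernels, the pointwise maximum used to combine their outputs, the relative threshold `rel_threshold`, and the 4-connected region labeling are hypothetical stand-ins, not the system's actual method. The spectrogram is taken to be a 2-D array with frequency along the rows and time along the columns.

```python
from collections import deque

import numpy as np

# Assumed detector bank: two Sobel kernels (not taken from the paper).
SOBEL_FREQ = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
SOBEL_TIME = SOBEL_FREQ.T


def filter2d(image, kernel):
    """Correlate image with a small odd-sized kernel; output keeps the input shape."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out


def edge_map(spectrogram):
    """Combine the detector outputs by taking the strongest absolute response."""
    responses = [np.abs(filter2d(spectrogram, k)) for k in (SOBEL_FREQ, SOBEL_TIME)]
    return np.maximum.reduce(responses)


def label_objects(mask):
    """Label 4-connected regions of a boolean mask with a breadth-first flood fill."""
    rows, cols = mask.shape
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or labels[r, c]:
                continue
            count += 1
            labels[r, c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = count
                        queue.append((ny, nx))
    return labels, count


def spectrogram_objects(spectrogram, rel_threshold=0.5):
    """Return (label image, object count) for edges above a relative threshold."""
    edges = edge_map(np.asarray(spectrogram, dtype=float))
    peak = edges.max()
    if peak == 0:  # flat input: no edges, hence no objects
        return np.zeros(edges.shape, dtype=int), 0
    return label_objects(edges >= rel_threshold * peak)


# Usage: a synthetic "formant" band of constant energy.
demo = np.zeros((20, 30))
demo[8:12, 5:25] = 1.0
demo_labels, demo_count = spectrogram_objects(demo)
```

Each labeled region in `demo_labels` plays the role of one two-dimensional object. Taking the maximum response keeps an edge regardless of its orientation; a system aimed at formant tracks might instead weight the frequency-direction detector more heavily, which this sketch does not attempt.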