Dynamic recognition of vowels by machine using trajectories in a two-dimensional feature space

Two real-valued features, derived from vowel formants in every 10-ms time frame, are plotted in the plane to form a trajectory. The trajectories are analyzed geometrically to extract stationary regions and turning points, and to fit straight lines to suitable parts. By relating these to "ideal" positions for six basic vowels, a new set of dynamic features is derived and used to classify already segmented vowels. Using a k-nearest neighbour rule with 2300 training vowels and as many test vowels, taken from continuous speech samples of the same group of 33 male speakers, an average success rate of 72% has been achieved in six-way classification. This may be compared with the 75-86% claimed for human subjects in similar tests, but with little training and much less data.
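
As a rough illustration of two steps named above, the following Python sketch finds stationary regions in a frame-by-frame trajectory and applies a k-nearest neighbour vote. It is not the authors' procedure: the abstract does not say how stationary regions are detected, which distance measure is used, or what value of k was chosen. The function names, the `max_step` and `min_frames` thresholds, the Euclidean distance, and the placeholder labels "V1" and "V2" are all assumptions made for the example.

```python
import math
from collections import Counter


def stationary_regions(trajectory, max_step=0.5, min_frames=3):
    """Return (start, end) frame index pairs of runs where the point barely moves.

    trajectory: list of (x, y) feature pairs, one per time frame.
    max_step and min_frames are hypothetical thresholds, not values from the paper.
    """
    regions = []
    start = 0
    for i in range(1, len(trajectory) + 1):
        # A run ends at the last frame or when the frame-to-frame step is large.
        run_ends = (
            i == len(trajectory)
            or math.dist(trajectory[i - 1], trajectory[i]) > max_step
        )
        if run_ends:
            if i - start >= min_frames:
                regions.append((start, i - 1))
            start = i
    return regions


def knn_classify(training, query, k=3):
    """Label a feature vector by majority vote among its k nearest training vectors.

    training: list of (feature_vector, label) pairs.
    Euclidean distance is an assumption; ties go to the label seen nearest first.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy trajectory: a steady stretch, a fast glide, then a second steady stretch.
    toy_trajectory = [
        (0.0, 0.0), (0.1, 0.0), (0.1, 0.1),
        (2.0, 2.0),
        (4.0, 4.0), (4.1, 4.0), (4.0, 4.1), (4.1, 4.1),
    ]
    print(stationary_regions(toy_trajectory))

    # Toy training set with placeholder vowel labels.
    toy_training = [
        ((0.0, 0.0), "V1"), ((0.0, 1.0), "V1"),
        ((5.0, 5.0), "V2"), ((5.0, 6.0), "V2"), ((6.0, 5.0), "V2"),
    ]
    print(knn_classify(toy_training, (0.2, 0.3), k=3))
```

In the system described, the vectors passed to the classifier would be the derived dynamic features rather than raw trajectory points; the toy data here only exercises the two functions.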