Interactive clustering techniques for selecting speaker-independent reference templates for isolated word recognition

We demonstrate that clustering can be a powerful tool for selecting reference templates for speaker-independent word recognition, and we describe a set of clustering techniques designed specifically for this purpose. These interactive procedures identify the coarse and fine structure of clusters, the overlap between clusters, and outliers that belong to no cluster. The techniques have been applied to a large speech database consisting of four repetitions of a 39-word vocabulary (the letters of the alphabet, the digits, and three auxiliary commands) spoken by 50 male and 50 female speakers. The cluster analysis shows that the data are highly structured, containing large, prominent clusters. Some statistics of the analysis and their significance are presented.
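
The general idea of choosing templates by clustering can be illustrated with a minimal sketch. It is not the procedure used in this work, which the abstract does not specify. It assumes only that a pairwise dissimilarity between tokens of one word is available, and it uses a simple greedy threshold rule followed by medoid selection. The function names, the threshold, and the toy one-dimensional "tokens" are all hypothetical.

```python
def medoid(members, dist):
    """Return the member with the smallest total distance to its cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def threshold_clusters(dist, threshold):
    """Greedy clustering (an assumed rule, not the paper's): repeatedly seed a
    cluster at the token with the most unassigned neighbours within threshold."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        def neighbours(i):
            return [j for j in remaining if dist[i][j] <= threshold]

        # sorted() makes tie-breaking deterministic (lowest index wins).
        seed = max(sorted(remaining), key=lambda i: len(neighbours(i)))
        members = sorted(neighbours(seed))
        clusters.append(members)
        remaining -= set(members)
    return clusters


def select_templates(dist, threshold, min_size=2):
    """One template (the medoid) per cluster; small clusters count as outliers."""
    clusters = threshold_clusters(dist, threshold)
    templates = [medoid(c, dist) for c in clusters if len(c) >= min_size]
    outliers = [i for c in clusters if len(c) < min_size for i in c]
    return templates, outliers


# Toy stand-in for the tokens of a single word: points on a line.
points = [0, 1, 2, 50, 51, 53, 90]
dist = [[abs(a - b) for b in points] for a in points]
templates, outliers = select_templates(dist, threshold=5)
print(templates, outliers)
```

In this toy case the two dense groups each contribute one template and the isolated point is reported as an outlier. With real speech data the dissimilarity matrix would be computed between word tokens rather than between points on a line.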