The ISOLET spoken letter database

1 Description

ISOLET is a database of letters of the English alphabet spoken in isolation. The database consists of 7800 spoken letters, 2 productions of each letter by 150 speakers. Each speaker is identified by a string specifying their gender and initials, followed by a number for uniqueness; e.g., "fbjt0" is a female with the initials "bjt". The utterances for each speaker are in a separate directory, one utterance per file.

The speakers are organized into five subsets: ISOLET-1, ISOLET-2, ISOLET-3, ISOLET-4 and ISOLET-5. Each subset contains utterances produced by 30 speakers, 15 male and 15 female. The grouping is arbitrary and roughly chronological. The total space used is 150 megabytes.

2 File Format

The digitized speech files use a format similar to TIMIT [3, 2] adc files. Each file consists of a header followed by a series of 16-bit integers. The header and data are stored in big-endian format with respect to bytes (Sun format); the most significant byte is in the lowest address. The header has the following format:

No. bytes   Description
2           Size of header in 2-byte words
2           Version
2           Number of channels
2           Rate in quarter microseconds
4           Number of samples
4           Little-endian flag

For ISOLET, the header size is 8 words. The version number is 1. The number of channels is 1. The rate is 250 quarter microseconds per sample, which is 62.5 microseconds per sample, or 16000 samples per second. The little-endian flag is 0.

The authors wish to thank Vincent Weatherill for recruiting and recording most of the speakers.
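As an illustration of the header layout given in the File Format section, the following is a minimal Python sketch of how one file might be decoded. It is not part of the ISOLET distribution. The names IsoletHeader, parse_header, sample_rate_hz and read_samples are hypothetical. The sketch assumes that the header fields are unsigned and that the samples are signed 16-bit integers, neither of which is stated in the description above, and it reads everything as big-endian.

```python
import struct
from typing import List, NamedTuple

# Big-endian layout: four 2-byte fields followed by two 4-byte fields (16 bytes).
# Assumption: header fields are unsigned; the description does not say.
HEADER_FORMAT = ">HHHHII"
HEADER_BYTES = struct.calcsize(HEADER_FORMAT)


class IsoletHeader(NamedTuple):
    header_words: int        # size of header in 2-byte words
    version: int
    channels: int
    rate_quarter_us: int     # sampling period in quarter microseconds
    num_samples: int
    little_endian_flag: int


def parse_header(raw: bytes) -> IsoletHeader:
    """Decode the fixed-size header from the start of a file's bytes."""
    return IsoletHeader(*struct.unpack(HEADER_FORMAT, raw[:HEADER_BYTES]))


def sample_rate_hz(header: IsoletHeader) -> float:
    """Convert a period in quarter microseconds to samples per second."""
    return 4_000_000 / header.rate_quarter_us


def read_samples(raw: bytes, header: IsoletHeader) -> List[int]:
    """Decode the samples that follow the header.

    Assumption: samples are signed 16-bit integers, big-endian.
    """
    offset = header.header_words * 2  # header size is given in 2-byte words
    count = header.num_samples
    return list(struct.unpack(f">{count}h", raw[offset:offset + 2 * count]))
```

For a file with the ISOLET values listed above (header size 8 words, rate 250), parse_header consumes the first 16 bytes and sample_rate_hz returns 16000.0. The sample offset is taken from the header size field rather than hard-coded, so the sketch does not depend on the header always being 8 words long.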

[1] Ronald A. Cole et al. Speaker-independent recognition of spoken English letters. 1990 IJCNN International Joint Conference on Neural Networks, 1990.