Segmentation Scheme for Use of a Speech Recognition Computer Program

A segmentation scheme is being developed to serve as the first step in a general speech recognition program. The input to the segmenter is sampled, quantized spectral data (35 frequency channels scanned every 5.5 msec). The output is a sequence of boundary markers together with a rough classification of the enclosed segments. The segments are generally phonemic, or smaller, in size, and the classification corresponds either to a type of phoneme (e.g., fricative, nasal, vowel) or to a sound characteristic (e.g., silence, aspiration). The segmenter operates by first classifying each 5.5‐msec scan of the spectral data. This classification is obtained by computing a number of measurements of different attributes of the spectral pattern and combining the results with suitable weighting functions. The number of such measurements is made larger than the logical minimum to avoid placing too much dependence on any one measurement. Segments are then built by grouping together scans that have the same individual classifica...
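
The two stages described above (classify each scan, then group adjacent scans that share a class) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the two measurements (total_energy, high_band_ratio), the three class labels chosen, the linear form of the weighting, and the weight values are hypothetical and are not taken from the segmenter itself. Only the 5.5-msec scan period and the 35-channel input come from the text.

```python
from itertools import groupby

SCAN_MS = 5.5  # scan period stated in the abstract


# Hypothetical measurements on one scan (a list of channel levels).
def total_energy(scan):
    """Mean level across all channels."""
    return sum(scan) / len(scan)


def high_band_ratio(scan):
    """Share of the total level found in the upper half of the channels."""
    total = sum(scan)
    if total == 0:
        return 0.0
    return sum(scan[len(scan) // 2:]) / total


MEASUREMENTS = (total_energy, high_band_ratio)

# Hypothetical weights: (bias, weight for total_energy, weight for high_band_ratio).
WEIGHTS = {
    "silence": (1.0, -4.0, 0.0),
    "fricative": (0.0, 0.5, 1.0),
    "vowel": (0.5, 0.5, -1.0),
}


def classify_scan(scan):
    """Score every class as a weighted sum of the measurements; return the best label."""
    values = [measure(scan) for measure in MEASUREMENTS]
    scores = {
        label: bias + sum(w * v for w, v in zip(ws, values))
        for label, (bias, *ws) in WEIGHTS.items()
    }
    return max(scores, key=scores.get)


def segment(scans):
    """Group consecutive scans with the same label into (start_ms, end_ms, label)."""
    labels = [classify_scan(scan) for scan in scans]
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start * SCAN_MS, (start + length) * SCAN_MS, label))
        start += length
    return segments


if __name__ == "__main__":
    quiet = [0.0] * 35
    low_heavy = [1.0] * 17 + [0.0] * 18
    high_heavy = [0.0] * 17 + [1.0] * 18
    for seg in segment([quiet] * 2 + [low_heavy] * 3 + [high_heavy]):
        print(seg)
```

In this sketch the boundary markers are simply the start and end times of each run of identically labeled scans. Using more measurements than classes strictly require, as the text recommends, would mean adding further functions to MEASUREMENTS and further weights to each row of WEIGHTS.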