An Automatic Algorithm for Locating the Beginning and End of an Utterance Using ADPCM Coded Speech

An automatic algorithm is described for locating the endpoints of an utterance that has been digitized using adaptive differential coding techniques (ADPCM) [P. Cummiskey, N. S. Jayant, and J. L. Flanagan, Proc. IEEE Int. Comm. Conf., Seattle (June 1973)]. The ADPCM coder operates at a 6‐kHz sampling rate and adaptively quantizes the speech using four‐bit samples (the code words). Since the ADPCM coder effectively has an automatic gain compression and expansion (companding) feature, it is possible to distinguish between silence and speech by simple measurements on the code words. In the actual implementation of the endpoint location algorithm, a threshold is set on the energy in the code words. The beginning of an utterance is defined as the time at which the code word energy exceeds the threshold for a fixed duration. The end of an utterance is defined as the time at which the code word energy falls below the threshold for another fixed duration. Extensive testing of the algorithm on single words an...
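
The threshold-and-duration rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the threshold, the two durations, or how code word energy is computed. The sketch assumes that energy is the sum of squared code word magnitudes over short non-overlapping frames, and that the durations are counted in frames. The names frame_energies, find_endpoints, min_speech, and min_silence are hypothetical.

```python
from typing import List, Optional, Tuple


def frame_energies(magnitudes: List[int], frame_len: int) -> List[int]:
    """Sum of squared code word magnitudes over non-overlapping frames.

    Assumption: each 4-bit code word has already been reduced to a
    non-negative magnitude; a trailing partial frame is dropped.
    """
    return [
        sum(m * m for m in magnitudes[i:i + frame_len])
        for i in range(0, len(magnitudes) - frame_len + 1, frame_len)
    ]


def find_endpoints(
    energies: List[int],
    threshold: int,
    min_speech: int,
    min_silence: int,
) -> Optional[Tuple[int, int]]:
    """Return (begin, end) frame indices, end exclusive, or None.

    begin: first frame of the first run of min_speech consecutive
           frames whose energy exceeds the threshold.
    end:   first frame of the first later run of min_silence
           consecutive frames whose energy does not exceed it
           (assumption: energy equal to the threshold counts as silence).
    """
    begin = None
    run = 0
    for i, e in enumerate(energies):
        run = run + 1 if e > threshold else 0
        if run == min_speech:
            begin = i - min_speech + 1
            break
    if begin is None:
        return None  # energy never stayed above threshold long enough

    end = len(energies)  # default: utterance runs to the end of the data
    run = 0
    for i in range(begin + min_speech, len(energies)):
        run = run + 1 if energies[i] <= threshold else 0
        if run == min_silence:
            end = i - min_silence + 1
            break
    return begin, end


if __name__ == "__main__":
    # Hypothetical per-frame energies with a brief dip inside the utterance.
    demo = [0, 0, 5, 9, 9, 9, 1, 9, 9, 0, 0, 0, 0]
    print(find_endpoints(demo, threshold=4, min_speech=2, min_silence=3))
```

Requiring the energy to stay on one side of the threshold for a fixed number of frames keeps a short dip inside a word (the single low frame in the demo data) from being taken as the end of the utterance, and keeps an isolated burst from being taken as its beginning.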