Automatic labelling of continuous speech with a given phonetic transcription using dynamic programming algorithms

A system is described which maps a given phonetic transcription onto an acoustic parameter representation of continuous speech. Linear prediction analysis, segmentation and formant tracking provide the acoustic parameters on a 5 ms frame basis, together with a sequence of voiced, unvoiced and silent segments. The given phonetic transcription is expanded to include implicit phone sequences and transitions. Labelling is then performed in two stages. In the first stage, segment labelling maps substrings of the expanded phone string onto the acoustic segments using a dynamic programming algorithm; the acoustic and phonetic units are correlated directly by means of a table of acoustic-phonetic rules. In the second stage, frame labelling assigns a single phone to each time frame using another dynamic programming algorithm based on the derivatives of energy and formant functions. The method is found to make the construction of a time-locked acoustic-phonetic database more objective and considerably easier.
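The segment labelling stage can be illustrated with a minimal sketch. It is not the system's actual algorithm, whose cost function and rule table are not given here. The sketch assumes that each acoustic segment receives a non-empty, contiguous substring of the phone string, and that a hypothetical table RULE_COST gives a mismatch cost for each pair of phone and segment class ("V" voiced, "U" unvoiced, "S" silent). The function name label_segments and all table entries are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical acoustic-phonetic rule table (an assumption for illustration):
# cost of placing a phone in a segment of class "V", "U" or "S".
RULE_COST: Dict[Tuple[str, str], float] = {
    ("a", "V"): 0.0, ("a", "U"): 1.0, ("a", "S"): 1.0,
    ("n", "V"): 0.0, ("n", "U"): 1.0, ("n", "S"): 1.0,
    ("s", "V"): 1.0, ("s", "U"): 0.0, ("s", "S"): 1.0,
    ("t", "V"): 1.0, ("t", "U"): 0.0, ("t", "S"): 0.0,
}


def label_segments(
    phones: List[str], segments: List[str]
) -> Tuple[float, List[List[str]]]:
    """Split `phones` into one non-empty substring per segment at minimal cost."""
    n, m = len(phones), len(segments)
    if m == 0 or n < m:
        raise ValueError("need at least one phone per segment")

    inf = float("inf")
    # best[j][i]: minimal cost of mapping the first i phones onto the first j segments.
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    # back[j][i]: index of the first phone assigned to segment j in that optimum.
    back = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0

    for j in range(1, m + 1):
        for i in range(j, n + 1):
            cost = 0.0
            # Grow the substring phones[k:i] backwards, one phone at a time.
            for k in range(i - 1, j - 2, -1):
                cost += RULE_COST[(phones[k], segments[j - 1])]
                total = best[j - 1][k] + cost
                if total < best[j][i]:
                    best[j][i] = total
                    back[j][i] = k

    # Trace the optimal substring boundaries back from the last segment.
    parts: List[List[str]] = []
    i = n
    for j in range(m, 0, -1):
        k = back[j][i]
        parts.append(phones[k:i])
        i = k
    parts.reverse()
    return best[m][n], parts


if __name__ == "__main__":
    total_cost, mapping = label_segments(
        ["s", "a", "n", "t", "a"], ["U", "V", "U", "V"]
    )
    print(total_cost, mapping)
```

In this toy example the phone string s a n t a is split so that the fricative and the stop fall into the unvoiced segments and the vowels and nasal into the voiced ones. A frame-level pass, as in the second stage described above, would then refine the phone boundaries inside each segment.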