Automatic Segmentation of Croatian Speech

This paper describes an automatic segmentation procedure for Croatian speech based on a monophone speech recognition system and word-level transcriptions of the speech signals. Automatic segmentation results are presented for 13 hours of speech from 25 professional speakers. Segmentation performance is assessed indirectly through phoneme recognition correctness and through word recognition correctness and accuracy. In addition, Croatian phoneme durations are calculated from the automatically segmented phones. Typical segmentation errors observed in the output of the procedure are presented and discussed. The resulting segmentation is used to build a Croatian large-vocabulary speech recognition system and an HMM-based speech synthesis system, both of which are modules of a spoken dialog system for enquiries about weather conditions and forecasts in Croatia.
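As an illustration of the indirect measures named above, the following minimal sketch computes correctness and accuracy from error counts and averages phone durations over a list of segments. It assumes the widely used definitions in which correctness ignores insertions and accuracy penalizes them; the abstract does not state the exact formulas, so this is an assumption. The function names and the (phone, start, end) segment format in milliseconds are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def correctness(n, deletions, substitutions):
    """Percent correct: share of the n reference units that were
    neither deleted nor substituted (insertions are ignored)."""
    return 100.0 * (n - deletions - substitutions) / n


def accuracy(n, deletions, substitutions, insertions):
    """Percent accuracy: like correctness, but insertions also count
    as errors, so the value can be lower (even negative)."""
    return 100.0 * (n - deletions - substitutions - insertions) / n


def mean_phone_durations(segments):
    """Average duration per phone label.

    segments: iterable of (phone, start_ms, end_ms) tuples, a
    hypothetical format standing in for a segmenter's output.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for phone, start_ms, end_ms in segments:
        totals[phone] += end_ms - start_ms
        counts[phone] += 1
    return {phone: totals[phone] / counts[phone] for phone in totals}
```

For example, with 100 reference words, 5 deletions, 10 substitutions, and 5 insertions, correctness is 85% and accuracy is 80%. The gap between the two figures shows how many spurious units the recognizer inserted.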