Sentence boundary detection in Arabic speech
This paper presents an automatic system for detecting sentence boundaries in speech recognition transcripts. Two component systems were developed that use independent sources of information: a linguistic system that uses linguistic features in a statistical language model, and an acoustic system that uses prosodic features in a feed-forward neural network model. A third system combines the scores from the acoustic and linguistic systems in a maximum-likelihood framework. All systems outlined in this paper are essentially language-independent, but all of our experiments were conducted on Arabic Broadcast News speech recognition transcripts. Our experiments show that, while the acoustic system outperforms the linguistic system, the combined system achieves the best performance at detecting sentence boundaries.
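To make the idea of score combination concrete, the following is a minimal sketch of one common way to fuse two independent boundary scores. It is not the paper's actual method: the abstract does not give the combination formula, so the log-linear interpolation, the function names `combine_scores` and `detect_boundaries`, and the `weight` and `threshold` parameters are all illustrative assumptions. In a maximum-likelihood setting, `weight` would typically be chosen to maximise the likelihood of held-out labelled data.

```python
import math


def combine_scores(p_acoustic, p_linguistic, weight=0.5):
    """Fuse two boundary posteriors by log-linear interpolation.

    Assumption: each input is the probability of a sentence boundary at
    one word position, as estimated by one of the two systems. The
    abstract does not specify this formula; it is a hypothetical sketch.
    """
    eps = 1e-12  # keep log() finite for inputs of exactly 0 or 1

    def clip(p):
        return min(max(p, eps), 1.0 - eps)

    pa, pl = clip(p_acoustic), clip(p_linguistic)

    # Weighted log scores for the "boundary" and "no boundary" classes.
    log_b = weight * math.log(pa) + (1.0 - weight) * math.log(pl)
    log_n = weight * math.log(1.0 - pa) + (1.0 - weight) * math.log(1.0 - pl)

    # Renormalise over the two classes (shifted by the max for stability).
    m = max(log_b, log_n)
    zb, zn = math.exp(log_b - m), math.exp(log_n - m)
    return zb / (zb + zn)


def detect_boundaries(acoustic, linguistic, weight=0.5, threshold=0.5):
    """Return one boolean per word position: True where a boundary is predicted."""
    if len(acoustic) != len(linguistic):
        raise ValueError("score sequences must have the same length")
    return [
        combine_scores(a, l, weight) > threshold
        for a, l in zip(acoustic, linguistic)
    ]
```

With `weight=1.0` the sketch reduces to the acoustic score alone and with `weight=0.0` to the linguistic score alone, so intermediate values trade off the two sources of information.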