Abstract: The purpose of an article summary is to facilitate quick and accurate identification of the topic of a published document. The objective is to save a prospective reader time and effort in finding useful information in a given article. This paper considers the task of text normalization in concatenative Text-To-Speech (TTS) synthesis for the Kannada language. The main focus is a single-document summarization tool based on a statistical approach. The paper describes how non-standard Kannada words (acronyms, abbreviations, proper names derived from other languages or clutters, phone numbers, decimal numbers, fractions, ordinary numbers, sequences of numbers, money, dates, measures, titles, times and symbols) are preprocessed before being passed to the TTS system as input. It also discusses the methodology used to normalize non-Kannada text in the input so that equivalent Kannada text is produced as output. The method uses a fast lexical analyzer, JFlex, to scan the input document for non-standard words.
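To illustrate the scanning step, the following is a minimal sketch of how a rule-based scanner might detect and label non-standard words. It is not the paper's JFlex specification: Python regular expressions stand in for the lexer rules, and the category names, the patterns, the `Rs.` currency prefix and the `find_nsw` helper are all assumptions made for illustration. Expansion of each token into Kannada words is not shown.

```python
import re

# Hypothetical token classes (assumed, not taken from the paper).
# Order matters: more specific patterns are listed first so that,
# for example, a date is not split into a fraction and a number.
NSW_PATTERNS = [
    ("DATE",     r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}"),
    ("TIME",     r"\d{1,2}:\d{2}"),
    ("MONEY",    r"Rs\.?\s?\d+(?:\.\d+)?"),
    ("FRACTION", r"\d+/\d+"),
    ("DECIMAL",  r"\d+\.\d+"),
    ("NUMBER",   r"\d+"),
    ("ACRONYM",  r"\b[A-Z]{2,}\b"),
]

# One combined pattern with a named group per class, similar in spirit
# to the rule list of a generated lexer.
SCANNER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in NSW_PATTERNS)
)


def find_nsw(text):
    """Return (category, token) pairs for each non-standard word found."""
    return [(m.lastgroup, m.group()) for m in SCANNER.finditer(text)]


if __name__ == "__main__":
    sample = "TTS demo on 12/05/2010 at 10:30 costs Rs. 250.50, 3/4 done, 42 items"
    for category, token in find_nsw(sample):
        print(category, token)
```

In a full normalizer, each labelled token would then be handed to a category-specific routine that spells it out in Kannada before synthesis. The ordering of the pattern list plays the role that rule priority plays in a lexer specification.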