Spoken query based word spotting in digitized Tamil documents

This paper presents an integrated approach to spotting spoken keywords in digitized Tamil documents by combining word image matching and spoken word recognition techniques. The work involves segmenting document images into words, creating an index of keywords, and constructing a word image hidden Markov model (HMM) and a speech HMM for each keyword. The word image HMMs are built from seven-dimensional profile and statistical moment features and are used to recognize each segmented word image, so that a recognized keyword can be added to the index. The spoken query word is recognized by selecting the speech HMM with the maximum likelihood, using 39-dimensional mel-frequency cepstral coefficients (MFCCs) derived from the speech samples of the keywords. During word spotting, the positional details of the search keyword, obtained from the automatically updated index, are used to retrieve the relevant portion of text from the document. Recall, precision, and F-measure are calculated for 40 test words from four groups of literary documents to illustrate the capability of the proposed scheme and its relevance to the emerging multilingual information retrieval scenario.
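
The query step described above (score the spoken query against every keyword HMM, keep the most likely keyword, then look up its positions in the index) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses discrete-observation HMMs and the scaled forward algorithm in place of models over 39-dimensional MFCC vectors, and the names `log_likelihood`, `recognize`, `spot`, the toy models, and the `(page, line, word)` index layout are all hypothetical.

```python
import math

def log_likelihood(obs, model):
    """Log P(obs | model) for a discrete HMM, via the scaled forward algorithm.

    model is (start, trans, emit): start[i] is the initial probability of
    state i, trans[i][j] the transition probability, emit[i][k] the
    probability of emitting symbol k in state i.
    """
    start, trans, emit = model
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    total = 0.0
    for symbol in obs[1:]:
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        total += math.log(scale)          # accumulate the log of each scale factor
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    final = sum(alpha)
    return total + math.log(final) if final > 0.0 else float("-inf")

def recognize(obs, models):
    """Return the keyword whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda kw: log_likelihood(obs, models[kw]))

def spot(obs, models, index):
    """Recognize the query, then fetch its stored positions from the index."""
    keyword = recognize(obs, models)
    return keyword, index.get(keyword, [])

# Toy data (assumed): two keywords, two states, two observation symbols.
TRANS = [[0.7, 0.3], [0.0, 1.0]]          # left-to-right topology
MODEL_A = ([1.0, 0.0], TRANS, [[0.9, 0.1], [0.8, 0.2]])  # favours symbol 0
MODEL_B = ([1.0, 0.0], TRANS, [[0.1, 0.9], [0.2, 0.8]])  # favours symbol 1
MODELS = {"kw_a": MODEL_A, "kw_b": MODEL_B}

# Hypothetical index: keyword -> list of (page, line, word) positions.
INDEX = {"kw_a": [(1, 3, 2)], "kw_b": [(2, 7, 5)]}

if __name__ == "__main__":
    print(spot([1, 1, 0, 1], MODELS, INDEX))
```

In a real system the observation sequence would be a sequence of continuous feature vectors and each HMM would use continuous emission densities; the selection rule (take the model with the highest likelihood) and the index lookup stay the same.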
