SMS based FAQ Retrieval for Hindi, English and Malayalam

This paper presents our approach to the SMS-based FAQ Retrieval monolingual task in FIRE 2012 and FIRE 2013. The current approach matches an SMS to FAQs more accurately than our previous solution for this task, which was submitted to FIRE 2011. This time, in addition to Hindi and English, we provide a solution for matching SMS queries and FAQs in Malayalam, an Indian language. To match SMS queries against the FAQ database, we introduce an enhanced similarity score, a proximity score, an enhanced length score, and an answer matching system. We introduce stemming of terms and consider the effect of joining adjacent terms in the SMS query and the FAQ to improve the similarity score. We propose a novel method for normalizing FAQ and SMS tokens to improve accuracy for Hindi. Moreover, we suggest a few character substitutions to handle errors in the SMS query. We demonstrate the effectiveness of our approach on real-life FAQ datasets provided by FIRE from a number of domains, such as Health, Telecom, Insurance, and Railway booking. Experimental results show that our solution is among the top-performing submissions for English, Hindi, and Malayalam. In FIRE 2012, our approach achieved Mean Reciprocal Rank (MRR) scores of 0.971, 0.973, and 0.761 for English, Hindi, and Malayalam, respectively. Furthermore, our solution topped the task for Hindi in FIRE 2013 with an MRR score of 0.971. Our approach also performs very well for English in FIRE 2013, even though the test dataset includes transcripts of speech queries alongside the normal SMS queries.
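
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Mean Reciprocal Rank is conventionally computed: each query contributes the reciprocal of the rank at which its correct FAQ first appears, or zero if it does not appear at all, and the contributions are averaged over all queries. The function name, argument names, and the sample identifiers are illustrative assumptions; this is not the FIRE evaluation script.

```python
from typing import Hashable, Mapping, Sequence


def mean_reciprocal_rank(
    ranked_results: Mapping[Hashable, Sequence[Hashable]],
    relevant: Mapping[Hashable, Hashable],
) -> float:
    """Average of 1/rank of the correct FAQ over all queries.

    ranked_results maps a query id to its ranked list of FAQ ids (best first).
    relevant maps a query id to the single correct FAQ id (hypothetical layout).
    A query whose correct FAQ is absent from its list contributes 0.
    """
    if not ranked_results:
        raise ValueError("at least one query is required")
    total = 0.0
    for query_id, candidates in ranked_results.items():
        target = relevant.get(query_id)
        for rank, faq_id in enumerate(candidates, start=1):
            if faq_id == target:
                total += 1.0 / rank
                break  # only the first correct hit counts
    return total / len(ranked_results)


# Hypothetical example: the correct FAQ is ranked 1st, 2nd, and not at all.
example_results = {
    "sms1": ["faq7", "faq2", "faq9"],
    "sms2": ["faq4", "faq1", "faq3"],
    "sms3": ["faq5", "faq6", "faq8"],
}
example_relevant = {"sms1": "faq7", "sms2": "faq1", "sms3": "faq0"}
example_mrr = mean_reciprocal_rank(example_results, example_relevant)
```

In this made-up example the three queries contribute 1, 1/2, and 0, so the MRR is 0.5. A score such as 0.971 therefore indicates that the correct FAQ was ranked first for nearly every query.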
