Answering English Questions using Foreign-Language, Semi-Structured Sources

Despite continuing advances in machine translation technology, users who lack familiarity with particular foreign languages have no good way to find information in those languages. In this paper, we present a technical framework and implemented system that answers English questions on the basis of information in foreign-language, semi-structured sources such as websites. This work helps users locate, with high precision, relevant segments of foreign-language information, and then makes use of existing machine translation services to present that information in English. The resulting technology extends an approach embodied in the START information access system and its supporting Omnibase uniform data access system, and it has been applied to several Chinese and Arabic websites.
