The effects of high quality translations of named entities in cross-language information exploration

Named entities (NEs) are expressions in human languages that explicitly link linguistic notations to entities in the real world. They play an important role in cross-language information retrieval (CLIR) because most user requests have been found to contain NEs, and the majority of out-of-vocabulary terms are NEs. Missing their translations therefore has a significant impact on retrieval effectiveness. In this paper, we examined the effect of high-quality NE translations in event-driven information exploration, where NEs are even more common. Focusing on NE translations obtained using information extraction (IE) techniques, we conducted several experiments using TDT test collections. Our results demonstrate that NEs and their translations play critical roles in improving CLIR effectiveness, and that using high-quality NE translations obtained by IE techniques has a positive impact on CLIR.
