Improving cross-language text retrieval with human interactions

Can we expect people to be able to get information from texts in languages they cannot read? We review two lines of research bearing on this question and show how our results are being used in the design of a new Web interface for cross-language text retrieval. One line of research, "interactive IR", is concerned with user interface issues for information retrieval systems, such as how best to display the results of a text search. We review our current research on "document thumbnail" visualizations and discuss current Web conventions, practices and folklore. The other line of research, "Cross-Language Text Retrieval", is concerned with the design of automatic techniques, including Machine Translation, to retrieve texts in languages other than the language of the query. We review work we have done on query translation and multilingual text summarization. We then describe how these results are being applied and extended in the design of a new demonstration interface, Keizai, an end-to-end, Web-based, cross-language text retrieval system. Beginning with an English query, the system will search Japanese and Korean Web data and display English summaries of the top-ranking documents. A user should be able to judge accurately which foreign-language documents are relevant to their information need, and to glean enough information from the translation to schedule specific documents for human translation and subsequent analysis.
