Ambiguity of Queries and the Challenges for Query Language Detection

This paper analyzes a sample of 510 simple searches from the 2009 action log of The European Library (TEL) for query content and query language. More than half of the queries are for named entities, which has consequences for query language disambiguation. Manual identification of the query language shows that a definite language often cannot be determined, because many named entities are not translated. Problems and challenges for identifying query category and query language are discussed. Further analysis shows that IP address and interface language are not very strong indicators of the query language.
