Analyzing search engine queries for the use of geographic terms

Millions of Web users seek information every day for business or private purposes on topics that cover the whole spectrum of human knowledge and interest, ranging from the mundane to the highly esoteric. However, users in general appear unwilling to learn new techniques for improving the relevance of their searches. Research suggests that any such improvement must be based on the “intelligent” interpretation of user queries by the search engines, with a minimum of user feedback. An understanding of how users actually use geographic terminology could play an important part in developing search engines capable of interpreting user queries. The aim of this study was to analyse the 2001 Excite query log to investigate the extent and variation of Web queries containing geographic terms. In particular, it investigated what people search for when they use geographic terms, the ways in which they describe a geographic location, the terminology used to find geographically related information and the structure of users’ queries when looking for geographically related information on the Web. This study also attempted to determine how geographically related queries differ from other queries. The results show that geographic terms form a significant part of the vocabulary of Web users and suggest that this is an area worth pursuing to improve the retrieval effectiveness of search engines. Geographically related queries formed nearly one fifth of all queries submitted to the Excite search engine, with place names being the most frequently occurring terms. Geographic queries were also shown to be longer than average, and the association of two or more terms within geographic queries was found to be high, suggesting that phrase analysis may be an important feature required of search engines.
However, it was not possible to assess the effectiveness of the queries, since no information was available on what users were actually looking for or on the results they obtained.
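
The kind of measurement summarised above (share of geographic queries, mean query length, and adjacent term pairs) can be illustrated with a minimal sketch. It is not the procedure used in the study: the tiny `GAZETTEER` set, the whitespace tokenizer, the function names and the sample queries are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical mini-gazetteer (assumption); a real analysis would need
# a far larger list of place names and other geographic terms.
GAZETTEER = {"london", "paris", "texas", "new york"}


def tokenize(query):
    """Lower-case the query and split it on whitespace."""
    return query.lower().split()


def is_geographic(query):
    """True if any single term or adjacent two-term phrase is in the gazetteer."""
    terms = tokenize(query)
    bigrams = [" ".join(pair) for pair in zip(terms, terms[1:])]
    return any(t in GAZETTEER for t in terms + bigrams)


def mean_length(queries):
    """Mean number of terms per query (0.0 for an empty list)."""
    if not queries:
        return 0.0
    return sum(len(tokenize(q)) for q in queries) / len(queries)


def summarize(queries):
    """Compare geographic and non-geographic queries in a list of query strings."""
    geo = [q for q in queries if is_geographic(q)]
    other = [q for q in queries if not is_geographic(q)]
    # Count adjacent term pairs in geographic queries as a crude
    # stand-in for term-association (phrase) analysis.
    pair_counts = Counter()
    for q in geo:
        terms = tokenize(q)
        pair_counts.update(zip(terms, terms[1:]))
    return {
        "geo_share": len(geo) / len(queries) if queries else 0.0,
        "geo_mean_len": mean_length(geo),
        "other_mean_len": mean_length(other),
        "top_pairs": pair_counts.most_common(3),
    }


# Made-up sample queries (assumption), not taken from the Excite log.
sample = [
    "hotels in new york",
    "cheap flights paris",
    "mp3 download",
    "texas real estate listings",
    "chat rooms",
]
stats = summarize(sample)
print(stats)
```

Exact matching against a place-name list is the simplest possible classifier; it misses descriptive geographic terms (for example directions or feature types) and ambiguous names, which a fuller analysis would have to handle.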
