In this research, we investigate a methodology for automatically classifying Web queries by topic and user intent. Taking a data set of more than 20,000 Web queries sectioned by topic, we manually classified each query using a three-level hierarchy of user intent. We note significant differences in user intent across topics. Results show that user intent (informational, navigational, and transactional) varies by topic (15 to 24 percent depending on the category). We then use this manually classified data set to automatically classify searches in a Web search engine query stream, using exact matching followed by an n-gram approach. These approaches have the advantage of being implementable in real time for query classification of Web searches. The implication is that a search engine can improve retrieval performance by more effectively identifying the intent underlying user queries.
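As a rough illustration of the two-stage idea described above, the following minimal sketch first looks a query up in a manually labeled set and, if no exact match is found, falls back to a vote over word n-grams shared with the labeled queries. The abstract does not specify the n-gram size, the matching rules, or the scoring, so the word-level unigrams and bigrams, the majority vote, the toy labeled queries, and all names (`LABELED`, `word_ngrams`, `build_ngram_index`, `classify`) are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical hand-labeled queries; NOT taken from the study's data set.
LABELED = {
    "facebook login": "navigational",
    "buy running shoes": "transactional",
    "download free music": "transactional",
    "what is diabetes": "informational",
    "history of rome": "informational",
}


def word_ngrams(query, n_max=2):
    """Return all word n-grams of the query for n = 1..n_max (assumed sizes)."""
    tokens = query.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_ngram_index(labeled, n_max=2):
    """Map each n-gram to a count of the intent labels it appears under."""
    index = defaultdict(Counter)
    for query, intent in labeled.items():
        for gram in word_ngrams(query, n_max):
            index[gram][intent] += 1
    return index


def classify(query, labeled, index, n_max=2):
    """Stage 1: exact match. Stage 2: majority vote over shared n-grams."""
    key = " ".join(query.lower().split())  # normalize case and whitespace
    if key in labeled:
        return labeled[key], "exact"
    votes = Counter()
    for gram in word_ngrams(key, n_max):
        votes.update(index.get(gram, {}))
    if not votes:
        return None, "unmatched"
    return votes.most_common(1)[0][0], "ngram"


INDEX = build_ngram_index(LABELED)
```

Both stages reduce to dictionary lookups, which is consistent with the claim that such approaches can run in real time. A production system would presumably need a much larger labeled set and a principled way to break ties and handle unmatched queries.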