Local search services (e.g., Yelp, Yahoo! Local) have emerged as a popular and effective paradigm for a wide range of information needs concerning local businesses; they now provide a viable and even more effective alternative to general-purpose web search for queries on local businesses. However, because the information needs behind local search are diverse, different query types call for different information retrieval strategies. In this paper, we explore a taxonomy of local search driven by users' information needs, which categorizes local search queries into three types: business category, chain business, and non-chain business. To decide which search strategy to use for each category in this taxonomy without placing the burden on web users, an automatic local query classifier is indispensable. However, since local search queries yield few online features and editorial labels are expensive to obtain, a purely supervised learning approach is insufficient. We address these problems by developing a semi-supervised approach that mines information needs from a vast amount of unlabeled data in local query logs to boost local query classification. Results of a large-scale evaluation over queries from a commercial local search site show that the proposed semi-supervised method allows us to accurately classify a substantially larger proportion of local queries than the supervised learning approach.