Explorations in tag suggestion and query expansion

The query submitted to a search system only approximates the user's true information need, and many factors can therefore reduce the quality of search results. One is query ambiguity: searchers with different needs issue the same query. For example, given the query java, some users may want a Java tutorial while others want to download Java software. Other factors include vocabulary mismatch and a lack of knowledge about the contents of the document collection. In any case, many users benefit from assistance in forming a good query, and some commercial services accordingly provide query suggestions for many queries. In this paper, we propose a Tag Suggestion System that uses the tags associated with query results to expand a searcher's query. Since not every web page has existing tags, we first build an auto-tagging system that can assign multiple tags to web pages such as news articles and blogs. The current system covers the 140 most popular tags in del.icio.us and achieves high precision. A small user study provides an anecdotal evaluation of our Tag Suggestion System and shows better quality than the query suggestion mechanisms provided by Yahoo! and Google. The result pages of expanded queries generated by the Tag Suggestion System are also significantly better than those of the original Google system.
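
To make the idea of expanding a query with tags from its results concrete, the following is a minimal sketch, not the system described in this paper. It assumes that each result page already carries a list of tags (from del.icio.us or from an auto-tagger) and that tags are ranked simply by the number of result pages carrying them. The function names suggest_tags and expand_query, the frequency-based ranking, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def suggest_tags(query, results_tags, top_k=3):
    """Rank tags by how many result pages carry them (illustrative heuristic).

    results_tags is a list with one entry per result page; each entry is
    the list of tags attached to that page. Tags that already appear as
    query terms are skipped, since they would not refine the query.
    """
    query_terms = set(query.lower().split())
    counts = Counter()
    for tags in results_tags:
        # Count each tag at most once per page.
        for tag in {t.lower() for t in tags}:
            if tag not in query_terms:
                counts[tag] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_k]]


def expand_query(query, tag):
    """Form an expanded query by appending a suggested tag."""
    return f"{query} {tag}"


# Hypothetical tags for four result pages of the ambiguous query "java".
example_results = [
    ["java", "tutorial", "programming"],
    ["java", "download", "software"],
    ["tutorial", "programming", "reference"],
    ["java", "programming"],
]

suggestions = suggest_tags("java", example_results, top_k=3)
expanded = [expand_query("java", tag) for tag in suggestions]
print(suggestions)  # ['programming', 'tutorial', 'download']
print(expanded)     # ['java programming', 'java tutorial', 'java download']
```

In this toy example the suggested tags separate the two intents mentioned above, since a searcher can pick "java tutorial" or "java download" instead of the ambiguous original query. A real system would likely weight tags by result rank or tagger confidence rather than by raw page counts.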