THUIR at NTCIR-9 INTENT Task

This is the first year IR group of Tsinghua University (THUIR) participates in NTCIR. We register the INTENT task and focus on the Chinese topics of subtopic mining and document ranking subtask. In our experiments, we try to mine subtopics from different resources, namely query recommendation, Wikipedia and the query-URL bipartite graph which is constructed by clickthrough data. We also develop some methods to re-rank the subtopics and remove reduplicate ones with query log and search result snippets in search engines. In the document ranking task, methods applied to diversify English documents are used to validate their effectiveness on Chinese pages, such as HITS, Novelty-Result Selection and Documents Duplication Elimination. Based on the new metric, called D#-nDCG, we propose a DocumentDiversification algorithm to select the documents retrieved for subtopics mined in the subtopic mining task, and user browse logs are also leveraged to re-rank these selected results.