Web query translation via web log mining

This paper describes a method to automatically acquire query translation pairs by mining web click-through data. The extraction requires no crawling or Chinese words segmentation, and can capture popular translations. Experimental results on a real click-through data show that only 17.4% of the extracted queries are in the dictionary, and our method can achieve 62.2% (in top-1) to 80.0% (in top-5) precision in translating web queries. Moreover, the extracted translations are semantically relevant to the source query, which is particularly useful for Cross-Lingual Information Retrieval (CLIR).