Auto-tagging of images in non-English languages using tag language conversion

Using web images with social tags as training data has become a major trend in the development of automatic image tagging and classification systems. Although sites such as Flickr offer abundant data, most of the tags obtained from them are in English, the dominant language on the web. This linguistic imbalance is expected to degrade auto-tagging results for non-English users, who need image tags in their native languages. The objective of this research is to develop an image auto-tagging system that can generate tags in languages other than English. This paper examines how linguistic imbalance in the training data affects auto-tagging systems that aim to generate tags in less widely represented languages. Furthermore, we propose methods that apply an auto-tagging model trained on English data and then convert its auto-tagging results to the target language. Subjective evaluations show that the proposed methods generate higher-quality auto-tagging results than conventional approaches.
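
As a rough illustration of the tag language conversion idea, the following minimal sketch maps scored English tags to a target language through a bilingual dictionary. The abstract does not specify how conversion is performed, so everything here is an assumption made for illustration: the `convert_tags` function, the `EN_TO_ES` dictionary, the choice of Spanish as the target language, and the rule of keeping the highest score when several English tags map to the same target tag.

```python
from typing import Dict, List, Tuple

# Hypothetical English-to-Spanish tag dictionary (illustration only;
# not taken from the paper).
EN_TO_ES: Dict[str, List[str]] = {
    "dog": ["perro"],
    "puppy": ["perro"],
    "beach": ["playa"],
    "sea": ["mar"],
}


def convert_tags(
    english_tags: List[Tuple[str, float]],
    dictionary: Dict[str, List[str]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Convert scored English tags to target-language tags.

    Assumed merging rule: when several English tags map to the same
    target tag, keep the highest score. English tags with no dictionary
    entry are dropped.
    """
    scores: Dict[str, float] = {}
    for tag, score in english_tags:
        for target in dictionary.get(tag, []):
            scores[target] = max(scores.get(target, 0.0), score)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Stand-in for the output of an English auto-tagging model.
english_output = [("dog", 0.9), ("puppy", 0.7), ("beach", 0.6), ("sunset", 0.5)]
converted = convert_tags(english_output, EN_TO_ES)
```

In this sketch, "dog" and "puppy" collapse into a single target tag, and "sunset" is discarded because the hypothetical dictionary has no entry for it. A real system would need a strategy for such out-of-dictionary and ambiguous tags.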