A novel approach for cross-language information retrieval

Cross-language information retrieval (CLIR) is a subfield of information retrieval that deals with retrieving information written in a language different from that of the user's query. CLIR is becoming increasingly important because a vast amount of web content is in English. There is a need for mechanisms that can retrieve English content and translate it into the user's native language. This paper proposes a combined text summarization and suffix tree algorithm for the overall CLIR process. The idea is to summarize the content and re-rank the results based on clustered coefficients. This method can improve the efficiency of content retrieval and afford great flexibility to the user. The key contributions of this paper are an explanation of the approach and a discussion of the key results.
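
The re-ranking idea can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes the summaries have already been produced, replaces the suffix tree with plain word n-grams (which find the same shared phrases for short texts, only less efficiently), and uses a hypothetical scoring rule (cluster size times phrase length) in place of the clustered coefficients, which the abstract does not define. The function names `word_ngrams` and `rerank_by_shared_phrases` are likewise illustrative.

```python
from collections import defaultdict


def word_ngrams(text, n_max=3):
    """Return the set of word n-grams (length 1..n_max) in text, lowercased."""
    words = text.lower().split()
    return {
        tuple(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    }


def rerank_by_shared_phrases(summaries, n_max=3):
    """Re-rank summaries by how strongly they overlap with other summaries.

    Every phrase shared by at least two summaries forms a base cluster.
    A summary's score is the sum, over the clusters it belongs to, of
    (cluster size * phrase length). This rule is an assumption made for
    illustration, not the coefficient used in the paper.
    """
    clusters = defaultdict(set)  # phrase -> indices of summaries containing it
    for idx, text in enumerate(summaries):
        for phrase in word_ngrams(text, n_max):
            clusters[phrase].add(idx)

    scores = [0] * len(summaries)
    for phrase, members in clusters.items():
        if len(members) < 2:
            continue  # a phrase found in one summary does not form a cluster
        for idx in members:
            scores[idx] += len(members) * len(phrase)

    # Descending score; the sort is stable, so ties keep the retrieval order.
    order = sorted(range(len(summaries)), key=lambda i: -scores[i])
    return [(summaries[i], scores[i]) for i in order]


if __name__ == "__main__":
    results = [
        "weather forecast for tomorrow",
        "suffix tree clustering groups documents",
        "suffix tree clustering of search results",
    ]
    for text, score in rerank_by_shared_phrases(results):
        print(score, text)
```

In this toy input the two summaries that share the phrase "suffix tree clustering" move ahead of the unrelated one, which is the effect the re-ranking step aims for.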