Statistical Methods for Cross-Language Information Retrieval

Multilingual information retrieval (IR) has largely been limited to the development of separate systems, each designed for use with a single foreign language. The explosion in the availability of electronic media in languages other than English makes the development of IR systems that can cross language boundaries increasingly important. We are currently developing tools and techniques for cross-language information retrieval. In this chapter, we present experiments that analyze the factors affecting dictionary-based methods for cross-language retrieval, and we describe methods that dramatically reduce the errors such an approach usually makes.