Using Parallel Web Pages for Multi-lingual IR

In this report, we describe the approach we used in the CLEF Cross-Language IR (CLIR) tasks. Our experiments rely on statistical models estimated from parallel texts automatically mined from the Web. In previous experiments, we tested this approach on English-French and English-Chinese CLIR. The goal of this series of experiments is to see whether the approach can be extended to multi-lingual IR involving other languages. In particular, we compare models trained on the Web documents with models that also incorporate other resources such as dictionaries.