NACSIS Test Collection Workshop (NTCIR-1)

The test collection used in the Workshop consists of more than 330,000 documents, more than half of which are English-Japanese paired. Although there is a Japanese test collection called BMIRJ2 consisting of 5,080 newspaper articles [2], Japanese test collections need to be enhanced in both the variety of text types and scale. We put emphasis on cross-lingual retrieval since it is critical in the Internet environment and for Japanese scientific information retrieval [3].