Lessons from BMIR-J2: a test collection for Japanese IR systems

BMIR-J2 is the first complete Japanese test collection available for evaluating information retrieval systems. It contains sixty queries and the IDs of 5080 newspaper articles in the fields of economics and engineering. The queries are classified into five categories, based on the functions a system is likely to need to interpret them correctly and retrieve relevant texts. The collection has two levels of relevance: topically relevant and partially relevant. Design issues such as collection type and size are also discussed. The collection and the principles derived in designing it should be helpful in the future development of new test collections.