Large-Scale Evaluation Infrastructure for Enhancement and Creativity in Information Access Technologies

This paper introduces the NTCIR (NII Test Collection for Information Retrieval and access technologies) Project and its series of evaluation workshops, the NTCIR Workshops, which are designed to enhance research in information access technologies, such as information retrieval, text summarization, question answering, and text mining, together with their cross-lingual extensions, by providing infrastructure for large-scale evaluation. With the spread of the Internet, information access technologies have become a fundamental part of the social infrastructure, and their importance has grown tremendously. Research and development in information access require solid, experiment-based evidence to show the superiority of a newly proposed system or strategy over previous ones. NTCIR has provided large-scale infrastructure for such testing and evaluation. Conducting meaningful and reliable large-scale evaluation is not easy: the success criterion for information access is a human judgment of "relevance", which is not consistent across assessors or over time. Moreover, the difficulty of information access varies with the document set, the users' search requests, and the users' tasks or situations. Under such circumstances, performing reliable, stable, and sensible evaluation is a constant challenge. We have accumulated insights on these matters by working together with participants and with other evaluation projects around the world, such as TREC, CLEF, and DUC. We hope that the NTCIR Workshops continue to serve these research areas and that the wide-ranging insights they have produced prove useful. To conclude, some thoughts on future directions are offered.
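To make the assessor-consistency problem concrete, the following minimal sketch computes Cohen's kappa over two assessors' binary relevance judgments for the same pooled documents. The judgment data and the choice of kappa here are illustrative assumptions for exposition, not part of the NTCIR methodology described above.

```python
# Minimal sketch: quantifying inter-assessor agreement on binary
# relevance judgments with Cohen's kappa. The judgments below are
# hypothetical, not NTCIR data.

def cohens_kappa(judgments_a, judgments_b):
    """Cohen's kappa for two assessors' binary relevance judgments."""
    assert len(judgments_a) == len(judgments_b)
    n = len(judgments_a)
    # Observed agreement: fraction of documents judged identically.
    observed = sum(a == b for a, b in zip(judgments_a, judgments_b)) / n
    # Expected chance agreement, from each assessor's marginal rates.
    p_a = sum(judgments_a) / n
    p_b = sum(judgments_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

# Two assessors judging the same ten pooled documents (1 = relevant).
assessor_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
assessor_2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
print(f"kappa = {cohens_kappa(assessor_1, assessor_2):.2f}")  # kappa = 0.40
```

Here the assessors agree on 7 of 10 documents, yet kappa is only 0.40 once chance agreement is discounted, which is one way to see why relevance-based evaluation results must be interpreted with care.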