Paper Information - Osaka Kyoiku University at NTCIR-10 CrossLink-2: Link Filtering by Title Tag of Corpus as a Dictionary


Our group (OKSAT) submitted two types of runs, named SMP and REF, for every subtask of NTCIR-10 Cross-lingual Link Discovery (CLLD). Our method uses the titles of Wikipedia pages (the corpus) in the source language as the entries of a dictionary, so no external dictionary is required. For SMP, we aimed to discover the cross-lingual links of the actual Wikipedia; in other words, SMP targets the Wikipedia ground truth. For REF, on the other hand, we aimed to automatically discover as many meaningful cross-lingual links as possible.
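The idea of using page titles as a dictionary can be illustrated with a minimal sketch. It is not the OKSAT implementation: the function name `find_anchor_candidates`, the greedy longest-match strategy, and the sample titles are all assumptions made for illustration. The sketch scans a text character by character and reports every span that exactly matches a title in the set, which avoids the need for word segmentation in languages such as Japanese.

```python
def find_anchor_candidates(text, titles, max_len=None):
    """Greedy longest-match scan of `text` against a set of page titles.

    Hypothetical helper: returns a list of (offset, title) pairs.
    Matching is done on characters, so no tokenizer is required.
    """
    if max_len is None:
        # Longest title bounds the window we need to try at each offset.
        max_len = max((len(t) for t in titles), default=0)

    matches = []
    i = 0
    while i < len(text):
        found = None
        # Try the longest possible span first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in titles:
                found = candidate
                break
        if found:
            matches.append((i, found))
            i += len(found)  # skip past the matched title
        else:
            i += 1
    return matches


# Illustrative title set (an assumption, not taken from the task corpus).
titles = {"大阪", "大阪教育大学", "京都"}
text = "大阪教育大学は大阪にある"
candidates = find_anchor_candidates(text, titles)
```

In this sketch the longer title wins when titles overlap at the same offset, so the university name is matched as one anchor rather than as the city name followed by unmatched text. Whether a matched span should actually become a link (the filtering step that distinguishes an SMP-style run from a REF-style run) is outside what this sketch covers.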

Takashi Sato
