Paper information - HCRL at NTCIR-10 MedNLP Task

HCRL at NTCIR-10 MedNLP Task

This year's MedNLP [1] has two tasks: the de-identification task and the complaint and diagnosis task. We tested both machine-learning-based methods and an ad hoc rule-based method on the two tasks. For the de-identification task, the rule-based method scored slightly higher, while for the complaint and diagnosis task, the machine-learning-based method achieved much higher recall and overall scores. These results suggest that the methods should be applied selectively, depending on the nature of the information to be extracted, that is, whether or not it can be captured by simple patterns.

Osamu Imaichi | Yoshiki Niwa | Toshihiko Yanase

[1] Tomoko Ohkuma, et al. Overview of the NTCIR-10 MedNLP Task, 2013, NTCIR.

[2] Yoshua Bengio, et al. Word Representations: A Simple and General Method for Semi-Supervised Learning, 2010, ACL.

[3] Yuji Matsumoto, et al. Japanese Dependency Analysis using Cascaded Chunking, 2002, CoNLL.

[4] Robert L. Mercer, et al. Class-Based n-gram Models of Natural Language, 1992, CL.