Paper Information - A Comparison of Cooking Recipe Named Entities between Japanese and English

A Comparison of Cooking Recipe Named Entities between Japanese and English

In this paper, we analyze the structural differences between the instructional text in Japanese and English cooking recipes. First, we constructed an English recipe corpus of 100 recipes, designed to be comparable to an existing Japanese recipe corpus. We annotated recipe named entities (r-NEs) in the English corpus according to guidelines previously defined for Japanese. We trained a state-of-the-art NE recognizer, PWNER, on the English r-NEs and achieved accuracy and coverage very similar to previous results for the Japanese corpus, thus demonstrating the quality and consistency of the annotations. Second, we compared the r-NEs annotated in the Japanese and English corpora and uncovered lexical, semantic, and underlying structural differences between Japanese and English recipes. We discuss reasons for these differences, which have significant implications for cross-language retrieval and automatic translation of recipes.

Yoko Yamakata | John Carroll | Shinsuke Mori
