A Comparison of Cooking Recipe Named Entities between Japanese and English

In this paper, we analyze the structural differences between the instructional text of Japanese and English cooking recipes. First, we constructed an English recipe corpus of 100 recipes, designed to be comparable to an existing Japanese recipe corpus. We annotated recipe named entities (r-NEs) in the English corpus according to guidelines previously defined for Japanese. We trained a state-of-the-art NE recognizer, PWNER, on the English r-NEs and achieved accuracy and coverage very similar to previous results for the Japanese corpus, thus demonstrating the quality and consistency of the annotations. Second, we compared the r-NEs annotated in the Japanese and English corpora and uncovered lexical, semantic, and underlying structural differences between Japanese and English recipes. We discuss reasons for these differences, which have significant implications for cross-language retrieval and automatic translation of recipes.