English Recipe Flow Graph Corpus

We present an annotated corpus of English cooking recipe procedures, and describe and evaluate computational methods for learning these annotations. The corpus consists of 300 recipes written by members of the public, which we have annotated with domain-specific linguistic and semantic structure. Each recipe is annotated with (1) "recipe named entities" (r-NEs) specific to the recipe domain, and (2) a flow graph representing in detail the sequencing of steps and the interactions between cooking tools, food ingredients, and the products of intermediate steps. For these two kinds of annotations, inter-annotator agreement ranges from 82.3 to 90.5 F1, indicating that our annotation scheme is appropriate and consistent. We experiment with producing these annotations automatically. For r-NE tagging we train a deep neural network NER tool; to compute flow graphs we train a dependency-style parsing procedure which we apply to the entire sequence of r-NEs in a recipe. In evaluations, our systems achieve 71.1 to 87.5 F1, demonstrating that our annotation scheme is learnable.
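
To make the two annotation layers concrete, the following minimal Python sketch represents a recipe's r-NEs as nodes and its flow graph as labeled directed edges between them, and scores a predicted graph against a gold graph with F1. The tag names (`ACTION`, `FOOD`, `TOOL`), the edge labels (`target`, `tool`), the example recipe fragment, and the exact-match scoring of labeled edges are illustrative assumptions; they are not the corpus's actual label inventory or the evaluation protocol behind the figures reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One r-NE: its position in the recipe's r-NE sequence, text, and tag."""
    idx: int
    text: str
    tag: str  # hypothetical tag names, for illustration only


@dataclass(frozen=True)
class Edge:
    """One labeled flow-graph arc between two r-NEs, identified by index."""
    src: int
    dst: int
    label: str  # hypothetical edge labels, for illustration only


def edge_f1(gold, pred):
    """F1 over labeled edges; an edge counts only if src, dst and label all match."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented fragment: "Chop the onions with a knife. Fry."
entities = [
    Entity(0, "Chop", "ACTION"),
    Entity(1, "onions", "FOOD"),
    Entity(2, "knife", "TOOL"),
    Entity(3, "Fry", "ACTION"),
]

gold = [
    Edge(1, 0, "target"),  # onions are what gets chopped
    Edge(2, 0, "tool"),    # knife is the tool used for chopping
    Edge(0, 3, "target"),  # the product of chopping is what gets fried
]

# A prediction that mislabels the knife edge.
pred = [
    Edge(1, 0, "target"),
    Edge(2, 0, "target"),
    Edge(0, 3, "target"),
]

score = edge_f1(gold, pred)
```

Because an edge from an action node to a later action stands for the intermediate product of the first step, the same structure captures both step sequencing and ingredient flow. In this toy case two of three edges match in each direction, so precision, recall, and F1 all equal 2/3.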
