Information retrieval in folktales using natural language processing

Our aim is to extract information about literary characters from unstructured texts. We combine natural language processing with reasoning on domain ontologies. The first task is to identify the main characters and the parts of the story in which these characters are described or act. We illustrate the system with a scenario in the folktale domain. The system relies on a folktale ontology that we developed based on Propp's model of folktale morphology.
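As a rough illustration of the first task, the sketch below matches sentence tokens against a tiny hand-written vocabulary of character classes and records the sentences in which each character appears. It is a minimal sketch under stated assumptions, not the system described here: the `CHARACTER_TERMS` mapping, the `find_characters` function, the `min_mentions` threshold, and the sample tale are all hypothetical, and a plain dictionary lookup stands in for reasoning on the folktale ontology.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for an ontology: character class -> lexical terms.
# These classes and terms are illustrative only, not the folktale ontology.
CHARACTER_TERMS = {
    "Human": {"king", "princess", "peasant"},
    "Animal": {"wolf", "fox"},
    "SupernaturalBeing": {"dragon", "witch"},
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_characters(text, min_mentions=2):
    """Return characters mentioned in at least `min_mentions` sentences,
    with their class and the indices of the sentences mentioning them."""
    term_to_class = {t: c for c, terms in CHARACTER_TERMS.items() for t in terms}
    mentions = defaultdict(list)
    for idx, sentence in enumerate(split_sentences(text)):
        # Use a set so a term counts once per sentence.
        for token in set(re.findall(r"[a-z]+", sentence.lower())):
            if token in term_to_class:
                mentions[token].append(idx)
    return {
        term: {"class": term_to_class[term], "sentences": idxs}
        for term, idxs in mentions.items()
        if len(idxs) >= min_mentions
    }


tale = (
    "Once there lived a king and a princess. A dragon seized the princess. "
    "The king wept. A fox watched from afar."
)
result = find_characters(tale)
for name, info in sorted(result.items()):
    print(name, info["class"], info["sentences"])
```

On this sample, only the king and the princess pass the two-sentence threshold. A fuller treatment would also need to resolve pronouns and alternative names, and would infer character classes by reasoning over the ontology rather than by lookup.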
