Towards information extraction from ISR reports for decision support using a two-stage learning-based approach

The main challenge of computational linguistics is to represent the meaning of text in a computer model. Statistics-based methods with manually created features have been used for more than 30 years, following a divide-and-conquer approach to mark interesting features in free text. Around 2010, deep learning concepts found their way into the text-understanding research community. Deep learning is attractive and easy to apply, but it needs large pools of annotated, high-quality data from every target domain, and such data are generally not available, especially in the military domain. When the application domain changes, additional or new data are needed to adapt the language models to the new domain. To overcome this everlasting “data problem”, we chose a novel two-step approach: first we produce a formal representation of the meaning, and then we apply a rule-based mapping to the target domain. As the intermediate representation we used Abstract Meaning Representation (AMR), for which we trained a general base model. This base model was then trained further with additional data from the intended domains (transfer learning). We evaluated the parser stepwise, measuring its performance against the amount of training data. This answered the question of how much data is needed to reach the required quality when the application domain changes. Mapping the meaning representation to the target domain model gave us more control over domain specifics that a machine learning approach with self-learned feature vectors generally cannot represent.
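The second stage, the rule-based mapping from the meaning representation to the target domain model, can be illustrated with a minimal sketch. It is not the mapping used in this work: the example sentence, the triple encoding of the AMR graph, the rule table `RULES`, the function `map_to_domain`, and the domain type `Observation` with its slot names are all hypothetical and chosen only to show the idea.

```python
# Illustrative sketch only: a hand-written AMR graph for the invented sentence
# "The patrol observed a vehicle at the bridge", encoded as
# (source, role, target) triples. A real system would obtain this graph
# from the trained AMR parser.
TRIPLES = [
    ("o", ":instance", "observe-01"),
    ("o", ":ARG0", "p"),
    ("p", ":instance", "patrol"),
    ("o", ":ARG1", "v"),
    ("v", ":instance", "vehicle"),
    ("o", ":location", "b"),
    ("b", ":instance", "bridge"),
]

# Hypothetical mapping rules: AMR frame -> (domain type, AMR role -> domain slot).
RULES = {
    "observe-01": (
        "Observation",
        {":ARG0": "observer", ":ARG1": "observed", ":location": "location"},
    ),
}


def map_to_domain(triples, rules):
    """Apply the rule table to an AMR graph and return domain records."""
    # Look up the concept attached to each graph variable.
    concepts = {src: tgt for src, role, tgt in triples if role == ":instance"}
    records = []
    for var, concept in concepts.items():
        if concept not in rules:
            continue  # no rule for this frame, so it is not mapped
        domain_type, slots = rules[concept]
        record = {"type": domain_type}
        for src, role, tgt in triples:
            if src == var and role in slots:
                # Resolve variables to their concepts; keep constants as-is.
                record[slots[role]] = concepts.get(tgt, tgt)
        records.append(record)
    return records


if __name__ == "__main__":
    print(map_to_domain(TRIPLES, RULES))
```

Because the rules are explicit data rather than learned weights, a domain expert can inspect and change them without retraining the parser, which is the kind of control over domain specifics that the two-step approach aims for.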
