Partial semantic parsing of sentences by means of grammatically augmented ontology and weighted affix context-free grammar

Although modern statistical and neural-network-based tools for parsing natural language texts have superseded classical approaches, there are still areas where generative grammars are used. These are areas where the collection of universal parallel corpora is still in progress; national sign languages are among them. Ontologies and common-sense databases play a valuable role in parsing and translating such languages. A grammatically augmented ontology (GAO) is an ontology extension that links phrases to their meaning. The link is established via special expressions that connect the meaning of a phrase to the grammatical and semantic attributes of the words that constitute it. The article introduces a new approach to sentence parsing based on integrating ontology relations into the productions of a weighted affix context-free grammar (WACFG). For this purpose, a new WACFG parser was developed, inspired by the works of C.H.A. Koster. Basic properties of WACFG are discussed, and an algorithm for selecting GAO expressions and converting them into a set of WACFG productions is provided. The proposed algorithm proved feasible for parsing and translating Ukrainian Spoken and Ukrainian Sign Language. The developed approach to mixed semantic and syntactic sentence parsing was tested on a database of sentences from the Ukrainian fairy tale “Fox Mykyta” by Ivan Franko, where 92% of sentences were parsed correctly.
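The selection and conversion step mentioned above can be illustrated with a small sketch. It is not the article's algorithm: the data model (`Slot`, `GaoExpression`, `Production`), the selection rule (keep an expression only if its head lemma occurs in the sentence), the bracketed affix notation, and the example concepts and weights are all assumptions made for illustration. Word order is ignored, and the head word is simply placed first on the right-hand side.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

# Hypothetical data model: none of these names come from the article.
Affixes = Tuple[Tuple[str, str], ...]  # e.g. (("case", "nom"),)


@dataclass(frozen=True)
class Slot:
    concept: str           # ontology concept allowed to fill the slot
    affixes: Affixes = ()  # grammatical attributes required of the filler


@dataclass(frozen=True)
class GaoExpression:
    meaning: str            # ontology concept expressed by the phrase
    head: str               # lemma of the head word
    slots: Tuple[Slot, ...]
    weight: float = 1.0     # assumed preference score for this reading


@dataclass(frozen=True)
class Production:
    lhs: str
    rhs: Tuple[str, ...]
    weight: float


def symbol(slot: Slot) -> str:
    """Render a slot as a nonterminal with affixes, e.g. Person[case=nom]."""
    attrs = ",".join(f"{name}={value}" for name, value in slot.affixes)
    return f"{slot.concept}[{attrs}]" if attrs else slot.concept


def select(expressions: Iterable[GaoExpression], lemmas: Set[str]) -> List[GaoExpression]:
    """Assumed selection rule: keep expressions whose head lemma is in the sentence."""
    return [e for e in expressions if e.head in lemmas]


def to_productions(expressions: Iterable[GaoExpression]) -> List[Production]:
    """Turn each expression into one weighted production: meaning -> "head" slots..."""
    return [
        Production(
            lhs=e.meaning,
            rhs=(f'"{e.head}"',) + tuple(symbol(s) for s in e.slots),
            weight=e.weight,
        )
        for e in expressions
    ]


# Toy ontology fragment (illustrative values only).
expressions = [
    GaoExpression("Reading", "read",
                  (Slot("Person", (("case", "nom"),)), Slot("Text", (("case", "acc"),))),
                  0.9),
    GaoExpression("Sleeping", "sleep", (Slot("Animal", (("case", "nom"),)),), 0.8),
]

productions = to_productions(select(expressions, {"boy", "read", "book"}))
for p in productions:
    print(p.lhs, "->", " ".join(p.rhs), f"(weight {p.weight})")
```

Restricting the grammar to expressions triggered by words in the sentence keeps the production set small for each parse; a real implementation would presumably also encode word order and affix agreement, which this sketch leaves out.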

[1] Michael Collins, et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[2] Ilyas Cicekli, et al. An Ontology-Based Approach to Parsing Turkish Sentences, 1998, AMTA.

[3] David W. Conrath, et al. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy, 1997, ROCLING/IJCLCLP.

[4] R. Durbin, et al. RNA sequence analysis using covariance models, 1994, Nucleic acids research.

[5] O. Lozynska, et al. Spoken and sign language processing using grammatically augmented ontology, 2015.

[6] Peter W. Foltz, et al. An introduction to latent semantic analysis, 1998.

[7] Oleksandr Marchenko, et al. Determining Semantic Valences of Ontology Concepts by Means of Nonnegative Factorization of Tensors of Large Text Corpora, 2014.

[8] Dallin D. Oaks. Structural Ambiguity in English: An Applied Grammatical Inventory, 2010.

[9] Francis Jeffry Pelletier, et al. Representation and Inference for Natural Language: A First Course in Computational Semantics, 2005, Computational Linguistics.

[10] Fernando Pereira, et al. Non-Projective Dependency Parsing using Spanning Tree Algorithms, 2005, HLT.

[11] Alistair A. Young, et al. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, MICCAI 2017.

[12] Faten Kharbat. A New Architecture for Translation Engine Using Ontology: One Step Ahead, 2011.

[13] M. V. Davydov. A probabilistic search algorithm for finding suboptimal branchings in mutually exclusive hypothesis graph, 2014, Int. J. Knowl. Based Intell. Eng. Syst.

[14] Oleksandr Marchenko, et al. Development of a Semantic and Syntactic Model of Natural Language by Means of Non-negative Matrix and Tensor Factorization, 2014, TSD.

[15] George A. Miller, et al. WordNet: A Lexical Database for English, 1995, HLT.

[16] Amit P. Sheth, et al. Altering document term vectors for classification: ontologies as expectations of co-occurrence, 2007, WWW '07.

[17] Sang Keun Rhee, et al. Ontology-based Semantic Relevance Measure, 2007, SWW 2.0.

[18] Cornelis H. A. Koster. Affix Grammars for Natural Languages, 1991, Attribute Grammars, Applications and Systems.

[19] Patrick F. Reidy. An Introduction to Latent Semantic Analysis, 2009.

[20] Philipp Cimiano, et al. Generating LTAG grammars from a lexicon/ontology interface, 2010, TAG.