Paper information - Extracting Frame-Like Structures from Google Books NGram Dataset

We propose a method that facilitates semi-automatic FrameNet construction. The method requires the Google Books NGram dataset and WordNet or another thesaurus for the target language. We evaluated the method on Russian n-grams. Because of the huge amount of available data, the method does not require sophisticated natural language processing techniques (e.g., word sense disambiguation), and it shows promising results.
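The abstract does not spell out the extraction steps, so the following is only a minimal sketch of one way n-gram counts and a thesaurus could be combined. It assumes that n-grams are reduced to (verb, preposition, noun) triples and that each noun maps to a single coarse semantic class. The names `NGRAM_COUNTS`, `NOUN_CLASS`, `aggregate_slots`, and `frame_like_structure`, the toy English data, and the `min_count` threshold are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy data: (verb, preposition, noun) trigrams with match counts.
# Real n-gram records would come from the Google Books NGram dataset.
NGRAM_COUNTS = {
    ("go", "to", "school"): 120,
    ("go", "to", "church"): 80,
    ("go", "to", "bed"): 60,
    ("go", "with", "friend"): 40,
    ("go", "with", "brother"): 30,
}

# Hypothetical stand-in for a thesaurus lookup (WordNet or similar):
# each noun maps to one coarse semantic class, with no sense disambiguation.
NOUN_CLASS = {
    "school": "institution",
    "church": "institution",
    "bed": "furniture",
    "friend": "person",
    "brother": "person",
}


def aggregate_slots(ngram_counts, noun_class):
    """Sum n-gram counts per (verb, preposition, semantic class) slot."""
    slots = defaultdict(int)
    for (verb, prep, noun), count in ngram_counts.items():
        cls = noun_class.get(noun)
        if cls is None:
            continue  # noun not covered by the thesaurus
        slots[(verb, prep, cls)] += count
    return dict(slots)


def frame_like_structure(slots, verb, min_count=50):
    """Keep the slots of one verb whose aggregated count reaches min_count."""
    kept = [
        (prep, cls, count)
        for (v, prep, cls), count in slots.items()
        if v == verb and count >= min_count
    ]
    # Most frequent slots first.
    return sorted(kept, key=lambda item: -item[2])


slots = aggregate_slots(NGRAM_COUNTS, NOUN_CLASS)
print(frame_like_structure(slots, "go"))
```

Summing counts over semantic classes rather than individual nouns is what lets sheer data volume stand in for word sense disambiguation in this sketch: a spurious sense contributes few counts to its class and falls below the threshold.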

Vladimir Ivanov | V. Ivanov
