Definition Extraction with LSTM Recurrent Neural Networks

Definition extraction is the task of automatically identifying definitional sentences in unstructured text. It is useful for ontology generation, relation extraction, and question answering. Previous methods rely on handcrafted features derived from the dependency structure of a sentence. Because only part of the dependency structure is used to extract features, information is lost in the process. We model definition extraction as a supervised sequence classification task and propose a new way to generate sentence features automatically using a Long Short-Term Memory (LSTM) neural network model. Our method learns features directly from raw sentences and their corresponding part-of-speech sequences, and so makes use of the whole sentence. We experiment on the Wikipedia benchmark dataset and obtain an \(F_1\) score of 91.2%, which outperforms the current state-of-the-art methods by 5.8%. We also show the effectiveness of our method on other languages by testing on a Chinese dataset, where it obtains an \(F_1\) score of 85.7%.
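To make the sequence classification setup concrete, the following is a minimal, untrained sketch of an LSTM forward pass over paired word and part-of-speech sequences, written in plain Python. The abstract does not specify the architecture, so every detail here is an assumption made for illustration: the class name `TinyLSTMClassifier`, the embedding and hidden sizes, the concatenation of word and tag embeddings, the use of the final hidden state for the decision, and the random weights. It shows only the general shape of the computation, not the authors' model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class TinyLSTMClassifier:
    """Hypothetical single-layer LSTM that scores one sentence (assumption)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, vocab, tagset, emb_dim=4, hidden=5, seed=0):
        rng = random.Random(seed)

        def rand_vec(n):
            return [rng.uniform(-0.1, 0.1) for _ in range(n)]

        self.hidden = hidden
        # Separate embedding tables for words and part-of-speech tags.
        self.word_emb = {w: rand_vec(emb_dim) for w in vocab}
        self.tag_emb = {t: rand_vec(emb_dim) for t in tagset}
        in_dim = 2 * emb_dim + hidden  # [word; tag; previous hidden state]
        self.W = {g: [rand_vec(in_dim) for _ in range(hidden)] for g in self.GATES}
        self.b = {g: [0.0] * hidden for g in self.GATES}
        # Logistic output layer on top of the final hidden state.
        self.w_out = rand_vec(hidden)
        self.b_out = 0.0

    def step(self, x, h, c):
        # One LSTM time step: compute the gates, then update cell and hidden state.
        z = x + h  # list concatenation of input and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, words, tags):
        # Read the sentence left to right, then score the final hidden state.
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for word, tag in zip(words, tags):
            x = self.word_emb[word] + self.tag_emb[tag]
            h, c = self.step(x, h, c)
        score = sum(w * v for w, v in zip(self.w_out, h)) + self.b_out
        return sigmoid(score)


# Illustrative input only: a made-up sentence with hand-assigned POS tags.
words = ["an", "ontology", "is", "a", "formal", "specification"]
tags = ["DT", "NN", "VBZ", "DT", "JJ", "NN"]

model = TinyLSTMClassifier(sorted(set(words)), sorted(set(tags)))
p_definition = model.predict_proba(words, tags)
print(round(p_definition, 3))  # untrained weights, so the value is not meaningful
```

In a real system the weights would be learned from labelled sentences by gradient descent, and the predicted probability would be thresholded to decide whether a sentence is definitional.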
