Exploring global sentence representation for graph-based dependency parsing using BLSTM-SCNN

Abstract. Deep learning has been widely applied to dependency parsing in recent years. In this paper, we propose an effective deep neural network model for graph-based dependency parsing. First, a feature extraction layer combines a bidirectional Long Short-Term Memory network (BLSTM) with a segment-based Convolutional Neural Network (SCNN) to capture rich contextual information from the whole sentence. Second, the features learnt by this layer are fed into a standard feed-forward network, which is trained with a max-margin criterion and predicts dependency labels. Finally, a classical dynamic programming algorithm searches the dependency graph for the best dependency structure of the sentence. We evaluate the proposed model on 14 languages, including English, Chinese and German. The results show that it achieves unlabeled attachment scores (UAS) and labeled attachment scores (LAS) competitive with state-of-the-art dependency parsers. Moreover, the model recovers long-distance dependencies better than common neural network models.
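The decoding step mentioned in the abstract can be illustrated with a small sketch. The abstract does not name the dynamic programming algorithm, so the code below assumes a first-order, projective setting decoded with an Eisner-style chart. The function name `eisner_decode`, the `scores[h][m]` layout (score of an arc from head `h` to modifier `m`, with index 0 as an artificial ROOT), and the hand-written score values are all illustrative assumptions; in the described model the scores would come from the feed-forward network on top of the BLSTM-SCNN features. Labels are omitted.

```python
from typing import List

NEG_INF = float("-inf")


def eisner_decode(scores: List[List[float]]) -> List[int]:
    """Return heads[m] for every token m; token 0 is ROOT and gets head -1.

    Assumption: scores[h][m] is a finite arc score for head h -> modifier m.
    """
    n = len(scores)
    # Charts are indexed [s][t][d]: d = 0 means the head is t, d = 1 means the head is s.
    complete = [[[NEG_INF, NEG_INF] for _ in range(n)] for _ in range(n)]
    incomplete = [[[NEG_INF, NEG_INF] for _ in range(n)] for _ in range(n)]
    complete_bp = [[[-1, -1] for _ in range(n)] for _ in range(n)]
    incomplete_bp = [[[-1, -1] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        complete[i][i] = [0.0, 0.0]  # single-token spans cost nothing

    for k in range(1, n):  # span length
        for s in range(n - k):
            t = s + k
            # Incomplete span: join two facing complete spans, then add one arc.
            best, best_r = NEG_INF, s
            for r in range(s, t):
                val = complete[s][r][1] + complete[r + 1][t][0]
                if val > best:
                    best, best_r = val, r
            incomplete[s][t][0] = best + scores[t][s]  # arc t -> s
            incomplete[s][t][1] = best + scores[s][t]  # arc s -> t
            incomplete_bp[s][t] = [best_r, best_r]
            # Complete span headed by t: complete left part + incomplete right part.
            for r in range(s, t):
                val = complete[s][r][0] + incomplete[r][t][0]
                if val > complete[s][t][0]:
                    complete[s][t][0], complete_bp[s][t][0] = val, r
            # Complete span headed by s: incomplete left part + complete right part.
            for r in range(s + 1, t + 1):
                val = incomplete[s][r][1] + complete[r][t][1]
                if val > complete[s][t][1]:
                    complete[s][t][1], complete_bp[s][t][1] = val, r

    heads = [-1] * n

    def backtrack(s: int, t: int, d: int, is_complete: bool) -> None:
        if s == t:
            return
        if is_complete:
            r = complete_bp[s][t][d]
            if d == 0:
                backtrack(s, r, 0, True)
                backtrack(r, t, 0, False)
            else:
                backtrack(s, r, 1, False)
                backtrack(r, t, 1, True)
        else:
            r = incomplete_bp[s][t][d]
            if d == 0:
                heads[s] = t
            else:
                heads[t] = s
            backtrack(s, r, 1, True)
            backtrack(r + 1, t, 0, True)

    backtrack(0, n - 1, 1, True)
    return heads


# Toy scores for "ROOT She reads books" (hypothetical values, not model output).
example_scores = [[0.0] * 4 for _ in range(4)]
example_scores[0][2] = 10.0  # ROOT -> reads
example_scores[2][1] = 10.0  # reads -> She
example_scores[2][3] = 10.0  # reads -> books
print(eisner_decode(example_scores))  # [-1, 2, 0, 2]
```

The chart runs in cubic time in the sentence length and returns only projective trees. This sketch does not restrict ROOT to a single child; a parser that needs that constraint would have to add it.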
