Sentence Compression as Deletion with Contextual Embeddings

Sentence compression is the task of creating a shorter version of an input sentence while preserving its important information. In this paper, we extend the deletion-based approach to compression with contextual embeddings. Unlike prior work, which typically relies on non-contextual embeddings (GloVe or Word2Vec), we exploit contextual embeddings that enable our model to capture the context of its inputs. More precisely, we stack a bidirectional Long Short-Term Memory (BiLSTM) network and a Conditional Random Field (CRF) on top of contextual embeddings to treat compression as sequence labeling. Experimental results on the benchmark Google dataset show that, by utilizing contextual embeddings, our model achieves a new state-of-the-art F-score compared to strong methods reported on the leaderboard.
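
As a concrete illustration of the architecture the abstract describes (not the authors' released code), the following PyTorch sketch stacks a BiLSTM and a CRF on top of precomputed contextual token embeddings and labels each token KEEP or DELETE. The class name BiLSTMCRFCompressor, the 768-dimensional embeddings (BERT-base size), the hidden size, and the use of the third-party pytorch-crf package are illustrative assumptions.

import torch
import torch.nn as nn
from torchcrf import CRF  # third-party package: pip install pytorch-crf

class BiLSTMCRFCompressor(nn.Module):
    """Sketch of a deletion-based compressor: contextual embeddings -> BiLSTM -> CRF."""

    def __init__(self, embed_dim=768, hidden_dim=256, num_tags=2):
        super().__init__()
        # BiLSTM over precomputed contextual token embeddings (e.g., from BERT)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # Project BiLSTM states to per-token emission scores over {KEEP, DELETE}
        self.emissions = nn.Linear(2 * hidden_dim, num_tags)
        # Linear-chain CRF models dependencies between adjacent tag decisions
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, embeddings, tags, mask):
        # Negative log-likelihood of the gold KEEP/DELETE tag sequence
        scores = self.emissions(self.lstm(embeddings)[0])
        return -self.crf(scores, tags, mask=mask, reduction='mean')

    def predict(self, embeddings, mask):
        # Viterbi decoding of the highest-scoring tag sequence
        scores = self.emissions(self.lstm(embeddings)[0])
        return self.crf.decode(scores, mask=mask)

A toy usage, with random tensors standing in for real contextual embeddings:

model = BiLSTMCRFCompressor()
emb = torch.randn(1, 5, 768)                   # 1 sentence, 5 tokens
tags = torch.tensor([[0, 1, 0, 0, 1]])         # 0 = KEEP, 1 = DELETE
mask = torch.ones(1, 5, dtype=torch.bool)
print(model.loss(emb, tags, mask))             # training objective
print(model.predict(emb, mask))                # decoded tag sequence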
