Entity Highlight Generation as Statistical and Neural Machine Translation

An entity highlight is a short, concise, and characteristic description of an entity, and it is useful in a variety of applications. In this article, we study the problem of automatically generating entity highlights from the descriptive sentences of entities. Specifically, we develop two computational approaches: one inspired by statistical machine translation (SMT), and the other a sequence-to-sequence learning (Seq2Seq) approach of the kind that has been successfully applied in neural machine translation and neural summarization. In the Seq2Seq approach, we use an attention mechanism, a copy mechanism, and a coverage mechanism. To generate entity-specific highlights, we also incorporate the entity name into the Seq2Seq model to guide the decoding process. We automatically collect large-scale training instances without any manual annotation, and ask annotators to create a test set. We compare against several strong baseline methods and evaluate the approaches with both automatic and manual evaluation. Experimental results show that the entity-enhanced Seq2Seq model with attention, copy, and coverage mechanisms significantly outperforms all other approaches on multiple evaluation metrics.
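To make the interplay of the three mechanisms concrete, the following is a minimal sketch of a single decoding step, assuming the common pointer-generator style formulation rather than the exact model of this article, which the abstract does not specify. The function name `decode_step`, the generation gate `p_gen`, and the toy inputs are illustrative assumptions. The sketch takes raw attention scores as given, so the coverage-aware attention scoring and the entity-name guidance are not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_step(source_tokens, attn_scores, vocab_probs, p_gen, coverage):
    """One illustrative decoding step (hypothetical helper, not the paper's code).

    source_tokens: tokens of the descriptive sentence
    attn_scores:   one unnormalized attention score per source token
    vocab_probs:   dict mapping each vocabulary word to its generation probability
    p_gen:         assumed gate in [0, 1]; weight given to generating vs. copying
    coverage:      sum of the attention weights from all previous steps
    """
    # Attention: normalize the scores into a distribution over source tokens.
    attention = softmax(attn_scores)

    # Copy: mix the vocabulary distribution with the attention distribution,
    # so that source words (including out-of-vocabulary ones) can be copied.
    final = {word: p_gen * p for word, p in vocab_probs.items()}
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight

    # Coverage: penalize attending again to tokens that were already covered,
    # then accumulate the current attention into the coverage vector.
    coverage_loss = sum(min(a, c) for a, c in zip(attention, coverage))
    new_coverage = [c + a for c, a in zip(coverage, attention)]
    return final, new_coverage, coverage_loss


if __name__ == "__main__":
    source = ["Tony", "is", "a", "singer"]
    vocab = {"singer": 0.5, "actor": 0.5}  # "Tony" is out of vocabulary
    coverage = [0.0] * len(source)
    for step in range(2):
        dist, coverage, loss = decode_step(source, [0.0] * 4, vocab, 0.5, coverage)
        print(step, round(dist["Tony"], 3), round(loss, 3))
```

In this toy run, the out-of-vocabulary token "Tony" still receives probability mass through the copy path, and the coverage loss rises from the first step to the second because the same source positions are attended to twice.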
