Monolingual sentence matching for text simplification

This work improves monolingual sentence alignment for text simplification, specifically for text in standard and Simple Wikipedia. We introduce a convolutional neural network architecture that models the similarity between two sentences. Because available parallel corpora are limited, the model is trained in a semi-supervised way, using the output of a high-performance knowledge-based alignment system. We use the resulting similarity score to rescore the knowledge-based output, and we adapt the model with a small hand-aligned dataset. Experiments show that both rescoring and adaptation improve the performance of the knowledge-based method.
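To make the pipeline in the abstract concrete, here is a minimal sketch of a convolution-plus-max-pooling sentence encoder, a cosine similarity between two encoded sentences, and a rescoring step. Everything in it is an assumption for illustration, not the authors' model: the embeddings and filters are random and untrained, the helper names (`embed`, `make_filters`, `encode`, `cosine`, `rescore`) are hypothetical, and the linear interpolation used for rescoring is one plausible choice that the abstract does not specify.

```python
import math
import random


def embed(tokens, dim=8):
    """Toy embeddings (assumption): one fixed pseudo-random vector per token."""
    vectors = []
    for token in tokens:
        rng = random.Random(token)  # string seeds are deterministic across runs
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


def make_filters(n_filters=16, width=2, dim=8, seed=0):
    """Untrained convolution filters, each spanning `width` consecutive tokens."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(width * dim)]
            for _ in range(n_filters)]


def encode(tokens, filters, width=2, dim=8):
    """Convolve over token windows, apply tanh, then max-pool over positions."""
    vectors = embed(tokens, dim)
    while len(vectors) < width:  # zero-pad very short sentences
        vectors.append([0.0] * dim)
    windows = [sum(vectors[i:i + width], [])  # concatenate `width` token vectors
               for i in range(len(vectors) - width + 1)]
    return [max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
            for f in filters]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rescore(kb_score, neural_score, weight=0.5):
    """Assumed rescoring rule: interpolate knowledge-based and neural scores."""
    return (1.0 - weight) * kb_score + weight * neural_score


filters = make_filters()
standard = "the cat is an obligate carnivore".split()
simple = "the cat eats only meat".split()
neural = cosine(encode(standard, filters), encode(simple, filters))
final = rescore(kb_score=0.7, neural_score=neural)  # 0.7 is a made-up input
```

In a semi-supervised setup like the one described, the filters and embeddings would be trained on sentence pairs labeled by the knowledge-based aligner and then adapted on the hand-aligned data; this sketch leaves training out entirely and only shows the forward scoring path.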

Yunhui Li | Yonghui Huang | Yi Luan
