Enhanced Text Matching Based on Semantic Transformation

Text matching is a core task in natural language processing (NLP) systems. It is regarded as a touchstone of NLP, and it aims to determine whether two texts are semantically equivalent. However, the semantic gap between texts remains an open problem. Inspired by the success of the cycle-consistent adversarial network (CycleGAN) in image domain transformation, we propose an enhanced text matching method that combines CycleGAN with the Transformer network. With this method, the semantics of text in a source domain are transferred to a similar or different target domain, which reduces the semantic distance between text pairs. We evaluate the method on paraphrase identification and question-answer matching. The matching degree is computed by a standard text matching model, so that the effect of the transformation on narrowing the semantic gap can be measured. The experiments show that our method achieves text domain adaptation and that it improves the results of different matching models.
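
The cycle-consistency idea borrowed from CycleGAN can be illustrated with a minimal sketch. It is not the implementation proposed here: the mappings `g` and `f` are hypothetical toy functions standing in for the Transformer-based generators, the three-dimensional vectors stand in for learned text representations, and the adversarial loss terms are omitted. The sketch only shows how a cycle-consistency penalty is computed and how a source-to-target mapping can reduce the distance between a text pair.

```python
import math

def l1_distance(u, v):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def cycle_consistency_loss(g, f, xs, ys):
    """Mean L1 error of x -> g -> f -> x plus mean L1 error of y -> f -> g -> y."""
    forward = sum(l1_distance(f(g(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(g(f(y)), y) for y in ys) / len(ys)
    return forward + backward

# Hypothetical toy "generators": a fixed shift and its exact inverse.
# In practice these would be trained networks, not hand-written maps.
OFFSET = [1.0, -0.5, 0.25]

def g(x):
    """Assumed source-to-target mapping."""
    return [a + o for a, o in zip(x, OFFSET)]

def f(y):
    """Assumed target-to-source mapping."""
    return [a - o for a, o in zip(y, OFFSET)]

# Made-up embeddings for two source texts and their target-domain pairs.
source = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
target = [g(x) for x in source]

cycle_loss = cycle_consistency_loss(g, f, source, target)
gap_before = cosine_distance(source[0], target[0])
gap_after = cosine_distance(g(source[0]), target[0])

print("cycle-consistency loss:", cycle_loss)
print("distance before transformation:", gap_before)
print("distance after transformation:", gap_after)
```

Because `f` is the exact inverse of `g` in this toy setup, the cycle loss vanishes and the transformed source coincides with its target. Trained generators would only approximate this behaviour.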
