Condition Aware and Revise Transformer for Question Answering

Question answering has received increasing attention in recent years. This work focuses on providing an answer that is compatible with both the user's intent and the conditioning information attached to the question, such as delivery status and stock information in e-commerce. In real-world applications, however, these conditions may be wrong or incomplete. Although existing question answering systems take external information into account, such as categorical attributes and knowledge base triples, they all assume that this information is correct and complete. To alleviate the effect of defective condition values, this paper proposes the Condition Aware and Revise Transformer (CAR-Transformer). CAR-Transformer (1) revises each condition value based on the whole conversation and the original condition values, and (2) encodes the revised conditions and uses the condition embedding to select an answer. Experimental results on a real-world customer service dataset demonstrate that CAR-Transformer can still select an appropriate reply when the conditions corresponding to the question contain wrong or missing values, and that it substantially outperforms baseline models in both automatic and human evaluations. The proposed CAR-Transformer can be extended to other NLP tasks that need to consider conditioning information.
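The two-step flow described in the abstract (revise the conditions, then use them to select an answer) can be illustrated with a toy sketch. This is not the CAR-Transformer itself: the learned Transformer components are replaced here by simple keyword matching and token overlap, and every name (`revise_conditions`, `select_answer`, `delivery_status`, the candidate replies) is a hypothetical chosen for illustration only.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase bag of words; a stand-in for a learned text encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def revise_conditions(conversation, conditions, allowed_values):
    """Step 1 (toy): revise each condition from the conversation and its original value.

    If the conversation explicitly mentions an allowed value, that value
    overrides the original; otherwise the original (possibly missing) is kept.
    """
    words = tokens(conversation)
    revised = {}
    for name, values in allowed_values.items():
        mentioned = [v for v in values if all(t in words for t in tokens(v))]
        revised[name] = max(mentioned, key=len) if mentioned else conditions.get(name)
    return revised


def select_answer(question, conditions, candidates):
    """Step 2 (toy): pick the candidate that fits both the question and the conditions."""
    q = tokens(question)

    def score(candidate):
        text, required = candidate
        overlap = sum((q & tokens(text)).values())
        match = sum(1 for k, v in required.items() if conditions.get(k) == v)
        conflict = sum(1 for k, v in required.items()
                       if conditions.get(k) not in (None, v))
        return overlap + 2 * match - 2 * conflict

    return max(candidates, key=score)[0]


# Hypothetical customer-service example with a wrong stored condition value.
allowed = {"delivery_status": ["shipped", "pending"]}
conversation = "I was told the order shipped yesterday. Where is my order?"
original = {"delivery_status": "pending"}  # assumed to be out of date
candidates = [
    ("Your order has shipped and should arrive soon.", {"delivery_status": "shipped"}),
    ("Your order is still pending and has not left the warehouse.",
     {"delivery_status": "pending"}),
]

revised = revise_conditions(conversation, original, allowed)
reply_without_revision = select_answer("Where is my order?", original, candidates)
reply_with_revision = select_answer("Where is my order?", revised, candidates)
```

In this toy setting, the out-of-date `pending` value steers selection toward the second reply, while the revised value steers it toward the first. The same revision step also fills in the condition when the original value is missing.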
