BERT Embeddings Can Track Context in Conversational Search

The use of conversational assistants to search for information is becoming increasingly popular among the general public, pushing research towards more advanced and sophisticated techniques. Interest in conversational search has grown in recent years, not only because conversational assistants have become widespread, but also because conversational search is a step towards a more natural interaction with the system. In this work, we explore the context of the conversation, through both the historical utterances and their respective embeddings, with the aim of developing a conversational search system that helps people search for information in a natural way. In particular, the system must understand the context in which a question is posed, tracking the current state of the conversation and detecting mentions of previous questions and answers. We achieve this with a context-tracking component based on neural query-rewriting models. Another crucial aspect of the system is providing the most relevant answers given the question and the conversational history. To this end, we used a Transformer-based re-ranking method and extended this architecture to use the conversational context. The results obtained with the developed system show the advantages of using the context present in the natural-language utterances and in the neural embeddings generated throughout the conversation.
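
The sketch below illustrates the two-stage pipeline the abstract describes: a neural query-rewriting model resolves the conversational context into a self-contained query, and a BERT-style cross-encoder re-ranks the retrieved passages. It uses the HuggingFace Transformers library; the checkpoints (`castorini/t5-base-canard`, `cross-encoder/ms-marco-MiniLM-L-6-v2`) and the `|||` turn separator are illustrative assumptions, not necessarily the exact models trained in this work.

```python
# Minimal sketch of a conversational search pipeline: neural query
# rewriting for context tracking, then BERT-based passage re-ranking.
# Checkpoint names are assumptions chosen for illustration.
import torch
from transformers import (AutoModelForSeq2SeqLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

# Stage 1: context tracking via query rewriting. The conversation
# history and the current utterance are concatenated, and the model
# generates a de-contextualized query with resolved references.
rw_tok = AutoTokenizer.from_pretrained("castorini/t5-base-canard")
rw_model = AutoModelForSeq2SeqLM.from_pretrained("castorini/t5-base-canard")

def rewrite(history: list[str], utterance: str) -> str:
    source = " ||| ".join(history + [utterance])  # separator is an assumption
    ids = rw_tok(source, return_tensors="pt", truncation=True).input_ids
    out = rw_model.generate(ids, max_length=64, num_beams=4)
    return rw_tok.decode(out[0], skip_special_tokens=True)

# Stage 2: re-rank candidate passages (e.g. retrieved with BM25) by
# scoring each (rewritten query, passage) pair with a cross-encoder.
rk_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
rk_tok = AutoTokenizer.from_pretrained(rk_name)
rk_model = AutoModelForSequenceClassification.from_pretrained(rk_name)

def rerank(query: str, passages: list[str]) -> list[tuple[float, str]]:
    batch = rk_tok([query] * len(passages), passages,
                   padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        scores = rk_model(**batch).logits.squeeze(-1)
    return sorted(zip(scores.tolist(), passages), reverse=True)

# Example: "it" in the follow-up question is resolved from the history
# before retrieval and re-ranking.
history = ["What is conversational search?"]
query = rewrite(history, "How does it differ from ad-hoc retrieval?")
```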
