Conversational query rewriting aims to reformulate a concise conversational query into a fully specified, context-independent query that can be effectively handled by existing information retrieval systems. This paper presents a few-shot generative approach to conversational query rewriting. We develop two methods, based on rules and self-supervised learning, to generate weak supervision data from large amounts of ad hoc search sessions, and use this data to fine-tune GPT-2 to rewrite conversational queries. On the TREC Conversational Assistance Track, our weakly supervised GPT-2 rewriter improves the state-of-the-art ranking accuracy by 12%, using only a very limited amount of manual query rewrites. In the zero-shot learning setting, the rewriter still performs comparably to previous state-of-the-art systems. Our analyses reveal that GPT-2 effectively picks up the task syntax and learns to capture context dependencies, even in hard cases involving group references and long-turn dependencies.
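To make the fine-tuning setup concrete, below is a minimal sketch of casting query rewriting as causal language modeling with GPT-2, using the Hugging Face transformers library. The session serialization (the ";" and "->" delimiters), the example queries, and the single-example training step are illustrative assumptions, not the paper's exact data format or training recipe; a faithful implementation would also build the weak-supervision pairs from ad hoc search sessions and may restrict the loss to the rewrite tokens only.

```python
# Minimal sketch: fine-tune GPT-2 to map a conversation history to a
# fully specified rewrite of the latest query. Delimiters and the toy
# example below are assumptions for illustration, not the paper's format.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One hypothetical weakly supervised example:
# prior turns plus the current (context-dependent) query -> its rewrite.
history = ["What is throat cancer?", "Is it treatable?"]
rewrite = "Is throat cancer treatable?"

# Serialize the session as a single token sequence ending in the rewrite.
source = " ; ".join(history)
text = f"{source} -> {rewrite}{tokenizer.eos_token}"
inputs = tokenizer(text, return_tensors="pt")

# Standard causal-LM fine-tuning step: the model learns to continue the
# conversational context with the context-independent rewrite.
# (Computing the loss over the full sequence is a simplification; one
# could instead mask the history tokens out of the labels.)
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

At inference time, under the same assumed format, one would feed the serialized history followed by the "->" delimiter and let the model generate until the end-of-sequence token, taking the generated continuation as the rewritten query.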