Conversational QA for FAQs

The goal of this work is to access the large body of domain-specific information available in Frequently Asked Question (FAQ) sites via conversational Question Answering (QA) systems. Training a system for each possible application domain is infeasible, which calls for research on transfer learning of conversational QA systems. We present DoQA, a dataset for accessing Domain specific FAQs via conversational QA that contains 1,637 information-seeking dialogues in the cooking domain (7,329 questions in total). The dialogues are created by crowd workers who play two roles: the user, who asks questions about a cooking topic posted on Stack Exchange, and the domain expert, who replies to each question by selecting a short span of text from the long textual reply in the original post. The expert can rephrase the selected span to make it sound more natural. Together with the dataset, we present results for state-of-the-art models, including transfer learning from Wikipedia QA datasets to our cooking FAQ dataset, and a more realistic scenario in which the passage containing the answer must first be retrieved. Our dataset and experiments show that conversational QA systems can access domain-specific FAQs with high quality and little in-domain training data, thanks to transfer learning.
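To make the setup concrete, the sketch below shows one plausible way to represent a DoQA-style dialogue turn and to build a model input by prepending recent conversation history to the current question, as is common for extractive conversational QA models. The field names, class layout, and history window are illustrative assumptions for this sketch, not the released DoQA data format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Turn:
    """One question/answer exchange in an information-seeking dialogue.

    Field names are illustrative; they do not mirror the released DoQA files.
    """
    question: str          # asked by the crowd worker playing the user
    span_answer: str       # short span selected by the "expert" from the FAQ reply
    rephrased_answer: str  # optional natural rewording of the selected span ("" if none)

@dataclass
class Dialogue:
    topic: str             # Stack Exchange cooking topic the dialogue is about
    passage: str           # long textual reply from the original FAQ post
    turns: List[Turn]

def build_model_input(dialogue: Dialogue, current_question: str, history: int = 2) -> str:
    """Concatenate the last `history` turns with the current question so an
    extractive QA model can resolve references ('it', 'that dough', ...)
    against earlier turns before selecting a span from `dialogue.passage`."""
    previous = dialogue.turns[-history:] if history > 0 else []
    context = " ".join(
        f"{t.question} {t.rephrased_answer or t.span_answer}" for t in previous
    )
    return f"{context} {current_question}".strip()
```

In the retrieval scenario mentioned above, the same history-augmented question could plausibly also serve as the query for retrieving the FAQ passage that contains the answer.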