End-to-End Conversation Modeling: Moving beyond Chitchat
DSTC7 Task 2 Description (v1.0)

Recent work [1, 2, 3, 4, 5, etc.] has shown that conversational models can be trained in a completely end-to-end and data-driven fashion, without any hand-coding. However, such prior work has been mostly applied to chitchat, as this is the salient trait of the social media data (e.g., Twitter [1]) utilized to train these systems. To effectively move beyond chitchat, fully data-driven models would need grounding in the real world and access to external knowledge (textual or structured), in order to produce system responses that are both substantive and “useful”. Figure 1 illustrates this desideratum: while an ideal response would directly include entities relevant to the current conversation, most state-of-the-art neural conversation models produce responses that are