Paper information - Research data supporting "Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems"

This dataset is in JSON format and contains log files of interactions between a turn-taking spoken dialogue system and Amazon Mechanical Turk (AMT) workers, collected from our previous live trials. It covers two application domains, San Francisco restaurants and hotels, each with around 1000 logs. The user responses are the 1-best hypotheses recognised by our ASR system, and the system responses were collected in a separate round of data collection on AMT. Around 5.1K system responses were collected for each domain. All users are anonymous.
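Since the files are plain JSON, a first look at them can be taken with any standard JSON parser. The following is a minimal sketch in Python, assuming only that each file is valid JSON; the function name `summarize_json_log`, the file name `example_log.json`, and the sample content are hypothetical stand-ins, because the internal schema of the logs is not described here.

```python
import json
import tempfile
from pathlib import Path


def summarize_json_log(path):
    """Load one JSON file and report its top-level type and size."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Lists and dicts have a meaningful length; scalar values do not.
    size = len(data) if isinstance(data, (list, dict)) else None
    return {"type": type(data).__name__, "size": size}


if __name__ == "__main__":
    # Made-up stand-in content written to a temporary file. It does NOT
    # reflect the real structure of the dataset, which is not documented here.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "example_log.json"
        sample.write_text(json.dumps([{"id": 1}, {"id": 2}]), encoding="utf-8")
        print(summarize_json_log(sample))
```

To inspect the actual data, pass the path of a downloaded file to `summarize_json_log` in place of the temporary sample, then explore the returned structure from there.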