Towards a French Smart-Home Voice Command Corpus: Design and NLU Experiments

Despite growing interest in smart homes, large semantically annotated voice command corpora for Natural Language Understanding (NLU) are scarce, especially for languages other than English. In this paper, we present an approach to generate customizable synthetic corpora of semantically annotated French commands for a smart home. A corpus generated with this approach was used to train three NLU models (a triangular CRF, an attention-based RNN and the Rasa framework), which were evaluated on a small corpus of real users interacting with a smart home. While the attention model performs best on another large French dataset, on the small smart-home corpus the models' performance varies across intent, slot and slot value classification. To the best of our knowledge, no other French corpus of semantically annotated voice commands is currently publicly available.
