Effective Slot Filling via Weakly-Supervised Dual-Model Learning

Slot filling is a challenging task in Spoken Language Understanding (SLU). Supervised methods usually require large amounts of annotation to maintain desirable performance. A common way to relieve this heavy dependence on labeled data is bootstrapping, which leverages unlabeled data; however, bootstrapping is known to suffer from semantic drift. We argue that semantic drift can be tackled by exploiting the correlation between slot values (phrases) and their respective types. Using a particular kind of weakly-labeled data, namely the plain phrases contained in sentences, we propose a weakly-supervised slot filling approach. Our approach trains two models, a classifier and a tagger, which learn from each other effectively on the weakly-labeled data. Experimental results demonstrate that our approach outperforms standard baselines on multiple datasets, especially in the low-resource setting.
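
The abstract describes the method only at a high level. The following is a minimal, illustrative sketch of the dual-model loop it outlines: a phrase classifier and a BIO sequence tagger that exchange confidence-filtered pseudo-labels on weakly-labeled sentences, where phrase boundaries are given but slot types are not. Everything here is an assumption for illustration rather than the authors' implementation: the class names, the mean-pooled LSTM phrase encoder, the softmax tagging head (the paper's tagger could instead use a CRF layer), and the confidence threshold, which is a common pseudo-labeling heuristic.

```python
# Illustrative sketch (not the authors' released code) of a dual-model
# mutual-learning step on weakly-labeled data: each model pseudo-labels the
# known phrase span for the other. All sizes and names are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, TYPES, EMB, HID = 10_000, 8, 100, 128  # assumed sizes
NUM_TAGS = 2 * TYPES + 1                      # BIO: B-t / I-t per type, plus O


class PhraseClassifier(nn.Module):
    """Maps a phrase (token ids) to a slot-type distribution."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HID, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * HID, TYPES)

    def forward(self, phrase):                 # phrase: (1, phrase_len)
        h, _ = self.rnn(self.emb(phrase))
        return self.out(h.mean(dim=1))         # (1, TYPES)


class SlotTagger(nn.Module):
    """Per-token BIO tagger over the full sentence (softmax head)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HID, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * HID, NUM_TAGS)

    def forward(self, sent):                   # sent: (1, sent_len)
        h, _ = self.rnn(self.emb(sent))
        return self.out(h)                     # (1, sent_len, NUM_TAGS)


def bio_tags(sent_len, span, slot_type):
    """Expand a typed span into per-token BIO tag ids (O = 0)."""
    tags = [0] * sent_len
    s, e = span
    tags[s] = 1 + 2 * slot_type                # B-type
    for i in range(s + 1, e):
        tags[i] = 2 + 2 * slot_type            # I-type
    return torch.tensor(tags)


def dual_step(clf, tagger, sent, span, opt_c, opt_t, threshold=0.9):
    """One mutual-learning step on a weakly-labeled sentence whose phrase
    span (start, end) is known but whose slot type is not."""
    s, e = span
    phrase = sent[:, s:e]

    # classifier -> tagger: type the phrase, then train the tagger on the
    # induced BIO sequence if the classifier is confident enough.
    with torch.no_grad():
        probs_c = F.softmax(clf(phrase), dim=-1)
    conf_c, type_c = probs_c.max(dim=-1)
    if conf_c.item() >= threshold:
        loss_t = F.cross_entropy(tagger(sent).squeeze(0),
                                 bio_tags(sent.size(1), span, type_c.item()))
        opt_t.zero_grad(); loss_t.backward(); opt_t.step()

    # tagger -> classifier: read the tag predicted at the span start, then
    # train the classifier on the implied slot type if confident and not O.
    with torch.no_grad():
        probs_t = F.softmax(tagger(sent), dim=-1)[0, s]   # (NUM_TAGS,)
    conf_t, tag_t = probs_t.max(dim=-1)
    if conf_t.item() >= threshold and tag_t.item() != 0:
        type_t = (tag_t.item() - 1) // 2        # invert the BIO tag mapping
        loss_c = F.cross_entropy(clf(phrase), torch.tensor([type_t]))
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
```

The key design point this sketch tries to capture is the two complementary views of the same weak signal: the classifier sees the phrase in isolation, the tagger sees it in sentence context, and each view's confident predictions supervise the other, which is what lets the pair resist the semantic drift that plagues single-model bootstrapping.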
