Two-Stage Voice Application Recommender System for Unhandled Utterances in Intelligent Personal Assistant

Intelligent personal assistants (IPAs) enable voice applications that facilitate people's daily tasks. However, because voice requests can be complex and ambiguous, the standard natural language understanding (NLU) component may fail to handle some of them properly. In such cases, a simple reply like "Sorry, I don't know" hurts the user's experience and limits the functionality of the IPA. In this paper, we propose a two-stage shortlister-reranker recommender system that matches third-party voice applications (skills) to unhandled utterances. In this approach, a skill shortlister retrieves candidate skills from the skill catalog by calculating both the lexical and the semantic similarity between skills and user requests. We also illustrate how to build a new system from observed data collected by a baseline rule-based system, and how exposure bias can create a discrepancy between offline and human metrics. Lastly, we present two relabeling methods that handle the incomplete ground truth and mitigate exposure bias. We demonstrate the effectiveness of the proposed system through extensive offline experiments. Furthermore, we present online A/B testing results that show a significant improvement in user experience satisfaction.
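As a rough illustration of the shortlister-reranker idea described above, the following minimal Python sketch scores each skill by a weighted mix of lexical and semantic similarity, keeps the top-k candidates, and then reorders them with a second-stage score. Everything in it is an assumption made for illustration and not the system from the paper: the toy catalog, the two-dimensional embeddings, the bag-of-words cosine used as the lexical measure, the mixing weight `alpha`, and the fixed second-stage scores in `RERANK_SCORES`.

```python
import math
from collections import Counter

# Hypothetical toy catalog; descriptions and 2-d embeddings are made up.
CATALOG = [
    {"name": "Sleep Sounds", "description": "play relaxing rain sounds for sleep",
     "embedding": [0.9, 0.1]},
    {"name": "Pizza Order", "description": "order a pizza for delivery",
     "embedding": [0.1, 0.9]},
    {"name": "Trivia Game", "description": "play a trivia quiz game",
     "embedding": [0.5, 0.5]},
]

# Hypothetical second-stage scores, standing in for a learned reranker.
RERANK_SCORES = {"Sleep Sounds": 0.9, "Trivia Game": 0.2, "Pizza Order": 0.1}


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def lexical_similarity(a, b):
    """Cosine similarity between bag-of-words term counts of two strings."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    return cosine([ca[t] for t in vocab], [cb[t] for t in vocab])


def shortlist(utterance, utterance_vec, catalog, k=2, alpha=0.5):
    """Stage 1: return the names of the k skills with the highest mixed score."""
    scored = []
    for skill in catalog:
        lex = lexical_similarity(utterance, skill["description"])
        sem = cosine(utterance_vec, skill["embedding"])
        scored.append((alpha * lex + (1 - alpha) * sem, skill["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def rerank(candidates, score_fn):
    """Stage 2: reorder shortlisted candidates by a second-stage score."""
    return sorted(candidates, key=score_fn, reverse=True)


candidates = shortlist("play rain sounds", [0.8, 0.2], CATALOG)
ranked = rerank(candidates, RERANK_SCORES.get)
print(ranked)
```

In a real deployment the lexical score would more plausibly come from a weighted scheme such as TF-IDF, the embeddings from a trained encoder, and the second-stage score from a trained ranking model; the sketch only shows how the two stages fit together.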
