An Intelligent Assistant for High-Level Task Understanding

People can interact with domain-specific intelligent assistants (IAs) to get help with tasks. However, user goals are sometimes complex and may require interaction with multiple applications. Current IAs are limited to specific applications, so users must directly manage execution across multiple applications in order to engage in more complex activities. An ideal personal agent would be able to learn, over time, about tasks that span different resources. This paper addresses the problem of cross-domain task assistance in the context of spoken dialogue systems. We propose approaches to discover users' high-level intentions and to use this information to assist users with their tasks. We collected real-life smartphone usage data from 14 participants and investigated how to extract high-level intents from users' descriptions of their activities. Our experiments show that understanding high-level tasks allows the agent to actively suggest apps relevant to particular user goals and reduce the cost of users' self-management.
