SUGILITE: Creating Multimodal Smartphone Automation by Demonstration

SUGILITE is a new programming-by-demonstration (PBD) system that enables users to create automations on smartphones. SUGILITE uses Android's accessibility API to support automating arbitrary tasks in any Android app (or even across multiple apps). When the user gives a verbal command that SUGILITE does not know how to execute, the user can demonstrate the task by directly manipulating the regular apps' user interfaces. By leveraging the verbal instructions, the demonstrated procedures, and the apps' UI hierarchy structures, SUGILITE automatically generalizes the script from the recorded actions, so it learns to perform the task with different variations and parameters from a single demonstration. Extensive error handling and context checking support forking the script when new situations are encountered and provide robustness when apps change their user interfaces. Our lab study suggests that users with little or no programming knowledge can successfully automate smartphone tasks using SUGILITE.
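
The abstract above describes recording user demonstrations through Android's accessibility API. As a rough, minimal sketch of that mechanism (not SUGILITE's actual code), the hypothetical DemoRecorderService below shows how a standard Android AccessibilityService can observe clicks in arbitrary third-party apps and capture UI-hierarchy features (package name, resource id, visible text) that a PBD system could later use to replay or generalize the action; the class name and logging logic are illustrative assumptions.

```java
// Minimal sketch of demonstration recording via Android's accessibility API.
// Not SUGILITE's implementation; only standard Android APIs are used, and the
// service must be declared in the manifest and enabled by the user in
// system accessibility settings before it receives any events.
import android.accessibilityservice.AccessibilityService;
import android.util.Log;
import android.view.accessibility.AccessibilityEvent;
import android.view.accessibility.AccessibilityNodeInfo;

public class DemoRecorderService extends AccessibilityService {

    private static final String TAG = "DemoRecorder";

    @Override
    public void onAccessibilityEvent(AccessibilityEvent event) {
        // Observe clicks the user performs while demonstrating a task.
        if (event.getEventType() != AccessibilityEvent.TYPE_VIEW_CLICKED) {
            return;
        }
        AccessibilityNodeInfo node = event.getSource();
        if (node == null) {
            return;
        }
        // The clicked view's resource id, text, and the host app's package
        // name identify the element in the UI hierarchy. A PBD system can
        // store these features in a script step and later replay the action
        // by finding a matching node and calling performAction().
        String appPackage = String.valueOf(event.getPackageName());
        String viewId = node.getViewIdResourceName();
        CharSequence text = node.getText();
        Log.d(TAG, "CLICK in " + appPackage
                + " on id=" + viewId
                + " text=" + text);
    }

    @Override
    public void onInterrupt() {
        // Required override; nothing to clean up in this sketch.
    }
}
```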
