A hybrid learning system for recognizing user tasks from desktop activities and email messages

The TaskTracer system helps multi-tasking users manage the resources that they create and access while carrying out their work activities. It does this by associating with each user-defined activity the set of files, folders, email messages, contacts, and web pages that the user accesses when performing that activity. The initial TaskTracer system relies on the user to notify the system each time the user changes activities. This is burdensome, however, and users often forget to tell TaskTracer which activity they are working on. This paper introduces TaskPredictor, a machine learning system that attempts to predict the user's current activity. TaskPredictor has two components: one for general desktop activity and another specifically for email. TaskPredictor achieves high prediction precision by combining three techniques: (a) feature selection via mutual information, (b) classification based on a confidence threshold, so that a prediction is made only when the estimated confidence is high enough, and (c) a hybrid design in which a Naive Bayes classifier estimates the classification confidence while a support vector machine makes the actual classification decision. This paper provides experimental results on data collected from TaskTracer users.
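The three techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the TaskPredictor implementation: the bag-of-tokens representation, the Bernoulli Naive Bayes with Laplace smoothing, the use of the maximum posterior as the confidence, the one-vs-rest linear SVM trained by subgradient descent, the threshold of 0.9, and all function names and toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def mutual_information(docs, labels, feature):
    """I(F; Y) between a binary feature (token present or absent) and the label."""
    n = len(docs)
    joint = Counter((feature in d, y) for d, y in zip(docs, labels))
    f_marg = Counter(feature in d for d in docs)
    y_marg = Counter(labels)
    return sum((c / n) * math.log((c / n) / ((f_marg[f] / n) * (y_marg[y] / n)))
               for (f, y), c in joint.items())

def select_features(docs, labels, k):
    """(a) Keep the k tokens with the highest mutual information."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda f: -mutual_information(docs, labels, f))[:k]

def train_nb(docs, labels, features):
    """Bernoulli Naive Bayes with Laplace smoothing (assumed variant)."""
    n_c = Counter(labels)
    prior = {c: n_c[c] / len(labels) for c in n_c}
    cond = {c: {f: (sum(1 for d, y in zip(docs, labels) if y == c and f in d) + 1)
                   / (n_c[c] + 2) for f in features} for c in n_c}
    return prior, cond

def nb_posterior(model, doc):
    """Normalized class posteriors, computed in log space for stability."""
    prior, cond = model
    log_p = {c: math.log(prior[c]) + sum(math.log(p if f in doc else 1 - p)
                                         for f, p in cond[c].items())
             for c in prior}
    m = max(log_p.values())
    z = sum(math.exp(v - m) for v in log_p.values())
    return {c: math.exp(v - m) / z for c, v in log_p.items()}

def train_svm(X, labels, epochs=50, lr=0.1, lam=0.01):
    """One-vs-rest linear SVMs via subgradient descent on the hinge loss."""
    models = {}
    for c in sorted(set(labels)):
        w, b = [0.0] * len(X[0]), 0.0
        for _ in range(epochs):
            for x, label in zip(X, labels):
                y = 1.0 if label == c else -1.0
                margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
                w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
                if margin < 1:  # hinge loss is active for this example
                    w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                    b += lr * y
        models[c] = (w, b)
    return models

def svm_predict(models, x):
    """Return the class whose one-vs-rest score is largest."""
    def score(c):
        w, b = models[c]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)

def predict_task(doc, features, nb, svm, threshold=0.9):
    """(b) + (c): Naive Bayes gates the prediction, the SVM makes the decision."""
    confidence = max(nb_posterior(nb, doc).values())
    if confidence < threshold:
        return None  # abstain rather than risk a wrong prediction
    return svm_predict(svm, [1.0 if f in doc else 0.0 for f in features])

# Toy data (hypothetical): each resource is a set of tokens labeled with a task.
docs = [{"latex", "draft", "figure"}, {"latex", "review", "draft"},
        {"invoice", "excel", "budget"}, {"excel", "budget", "travel"}]
labels = ["paper", "paper", "budget", "budget"]
features = select_features(docs, labels, k=4)
nb = train_nb(docs, labels, features)
svm = train_svm([[1.0 if f in d else 0.0 for f in features] for d in docs], labels)
print(predict_task({"latex", "draft"}, features, nb, svm))
print(predict_task({"latex", "excel"}, features, nb, svm))
```

On this toy data the first call yields a task label because the Naive Bayes posterior is well above the threshold, while the second mixes tokens from both tasks and the predictor abstains by returning None. Raising the threshold trades coverage for precision, which is the trade-off the abstract alludes to.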
