Discovering frequent work procedures from resource connections

Intelligent desktop assistants could provide more help for users if they could learn models of the users' workflows. However, discovering desktop workflows is difficult because they unfold over extended periods of time (days or weeks) and they are interleaved with many other workflows because of user multi-tasking. This paper describes an approach to discovering desktop workflows based on rich instrumentation of information flow actions such as copy/paste, SaveAs, file copy, attach file to email message, and save attachment. These actions allow us to construct a graph whose nodes are files, email messages, and web pages and whose edges are these information flow actions. A class of workflows that we call work procedures can be discovered by applying graph mining algorithms to find frequent subgraphs. This paper describes an algorithm for mining frequent closed connected subgraphs and then describes the results of applying this method to data collected from a group of real users.
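The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's closed connected subgraph mining algorithm. The event log, the `type:name` node-id format, the action names, and the helpers `node_type`, `build_graph`, and `frequent_paths` are all assumptions made for the example. As a simple stand-in for frequent subgraph mining, it only counts typed two-edge paths and keeps those that reach a minimum support.

```python
from collections import Counter, defaultdict

# Hypothetical information-flow events: (source, action, destination).
# Node ids use an assumed "type:name" convention for illustration only.
EVENTS = [
    ("email:1", "SaveAttachment", "file:report_v1.doc"),
    ("file:report_v1.doc", "SaveAs", "file:report_v2.doc"),
    ("file:report_v2.doc", "AttachToEmail", "email:2"),
    ("email:3", "SaveAttachment", "file:budget_v1.xls"),
    ("file:budget_v1.xls", "SaveAs", "file:budget_v2.xls"),
    ("file:budget_v2.xls", "AttachToEmail", "email:4"),
    ("web:5", "CopyPaste", "file:notes.txt"),
]


def node_type(node_id):
    """Return the resource type ("file", "email", "web") of a node id."""
    return node_id.split(":", 1)[0]


def build_graph(events):
    """Build a directed graph: node -> list of (action, destination)."""
    graph = defaultdict(list)
    for src, action, dst in events:
        graph[src].append((action, dst))
    return graph


def frequent_paths(graph, min_support=2):
    """Count typed two-edge paths and keep those seen at least min_support times.

    Concrete resources are abstracted to their types so that the same
    procedure carried out on different files maps to the same pattern.
    """
    counts = Counter()
    for src, out_edges in list(graph.items()):
        for act1, mid in out_edges:
            # .get avoids inserting new keys into the defaultdict
            for act2, dst in graph.get(mid, []):
                pattern = (node_type(src), act1, node_type(mid),
                           act2, node_type(dst))
                counts[pattern] += 1
    return {p: n for p, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for pattern, support in frequent_paths(build_graph(EVENTS)).items():
        print(support, pattern)
```

In this toy log, the "save attachment, then SaveAs" and "SaveAs, then attach to email" paths each occur twice, while the isolated copy/paste edge forms no two-edge path. Support here counts occurrences within a single graph, which is a simplification; a full frequent subgraph miner would also handle larger patterns, isomorphism checks, and closedness.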
