Automatic Discovery of Personal Topics to Organize Email

We present a procedure that automatically discovers a user's personal topics by clustering their email. Unlike previous work, we automatically label each topic with appropriate keywords. We show that obtaining appropriate keywords requires strong filters that use domain knowledge about email and about the user's workplace. We demonstrate these keywords in an email/document browser that uses them as standing queries to create virtual folders, which help organize, index, and retrieve email efficiently. We present subjective user studies that show the usefulness of the strong filtering.
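As an illustration of the standing-query idea only, the following minimal sketch assumes that each topic has already been reduced to a small set of lowercase keywords and treats a virtual folder as the set of emails matching any keyword of that topic. The function names, the sample emails, and the keyword sets are all hypothetical; the clustering and the strong filtering described above are not reproduced here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def virtual_folders(emails, topic_keywords):
    """Map each topic label to the ids of emails matching any of its keywords.

    emails: dict of email id -> message text (hypothetical sample data).
    topic_keywords: dict of topic label -> set of lowercase single-word keywords.
    An email may land in several folders, since folders are queries, not containers.
    """
    folders = defaultdict(list)
    for email_id, text in emails.items():
        words = tokenize(text)
        for label, keywords in topic_keywords.items():
            if words & keywords:  # at least one keyword occurs in the email
                folders[label].append(email_id)
    return dict(folders)


# Hypothetical data, invented for this example.
emails = {
    1: "Budget review meeting moved to Friday",
    2: "Draft of the clustering paper attached",
    3: "Please send the budget spreadsheet",
}
topic_keywords = {
    "budget": {"budget", "spreadsheet"},
    "paper": {"paper", "draft"},
}

folders = virtual_folders(emails, topic_keywords)
```

Because the folders are computed from queries rather than stored, rerunning `virtual_folders` after new mail arrives updates every folder without any manual filing.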
