Exploring Hierarchical User Feedback in Email Clustering

Organizing data into hierarchies is natural for humans. However, there is little work in machine learning that explores human-machine mixed-initiative approaches to organizing data into hierarchical clusters. In this paper, we consider mixed-initiative clustering of a user's email, in which the machine produces (initial and re-trained) hierarchical clusterings of email, and the user iteratively reviews and edits the hierarchical clustering, providing constraints on the next iteration of clustering. Key challenges include (a) determining types of feedback that users will find natural to provide, (b) developing hierarchical clustering and retraining algorithms capable of accepting these types of user feedback, (c) determining the correspondence between two hierarchical structures, and (d) understanding how user behavior changes during a single feedback session and designing machine strategies that adapt to those changes. Preliminary experimental results from two cases show that, under ideal conditions, this mixed-initiative approach requires only 6 minutes of user effort to achieve email clusterings comparable to those requiring 13 to 15 minutes of manual editing effort.