Helping Users Sort Faster with Adaptive Machine Learning Recommendations

Sorting and clustering large numbers of documents can be an overwhelming task: manual solutions tend to be slow, while machine learning systems often present results that don't align well with users' intents. We created and evaluated iCluster, a system for helping users sort large numbers of documents into clusters. iCluster recommends new items for existing clusters and appropriate clusters for items. The recommendations are based on a learning model that adapts over time: as the user adds more items to a cluster, the system's model improves and the recommendations become more relevant. Thirty-two subjects used iCluster to sort hundreds of data items both with and without recommendations; we found that recommendations allow users to sort items more rapidly. A pool of 161 raters then assessed the quality of the resulting clusters, finding that clusters generated with recommendations were statistically indistinguishable in quality from those generated manually. Both the manual and assisted methods were substantially better than a fully automatic method.
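The abstract does not specify the learning model, so the following is only a minimal sketch of the two recommendation directions it describes (items for a cluster, and a cluster for an item), assuming a simple nearest-centroid model over numeric feature vectors. The class `CentroidRecommender`, its method names, and the toy two-dimensional vectors are all hypothetical illustrations, not iCluster's actual method.

```python
import math


class CentroidRecommender:
    """Hypothetical stand-in for an adaptive recommendation model.

    Assumption: each item is a numeric feature vector and each cluster is
    summarized by the mean of the items the user has placed in it.
    """

    def __init__(self):
        self.sums = {}    # cluster name -> element-wise sum of its vectors
        self.counts = {}  # cluster name -> number of items added so far

    def add_item(self, cluster, vector):
        """Record that the user placed an item in a cluster (model update)."""
        if cluster not in self.sums:
            self.sums[cluster] = [0.0] * len(vector)
            self.counts[cluster] = 0
        self.sums[cluster] = [s + v for s, v in zip(self.sums[cluster], vector)]
        self.counts[cluster] += 1

    def centroid(self, cluster):
        """Mean vector of the items currently in the cluster."""
        n = self.counts[cluster]
        return [s / n for s in self.sums[cluster]]

    def recommend_cluster(self, vector):
        """Suggest the existing cluster whose centroid is closest to the item."""
        return min(self.sums, key=lambda c: math.dist(self.centroid(c), vector))

    def recommend_items(self, cluster, unsorted_items, k=3):
        """Suggest up to k unsorted items closest to the cluster's centroid.

        unsorted_items maps an item name to its feature vector.
        """
        center = self.centroid(cluster)
        ranked = sorted(
            unsorted_items,
            key=lambda name: math.dist(center, unsorted_items[name]),
        )
        return ranked[:k]


model = CentroidRecommender()
model.add_item("sports", [1.0, 0.0])
model.add_item("cooking", [0.0, 1.0])

unsorted = {"a": [0.8, 0.1], "b": [0.1, 0.9], "c": [0.6, 0.3]}
suggested_cluster = model.recommend_cluster([0.9, 0.2])
suggested_items = model.recommend_items("sports", unsorted, k=2)

# Each user action shifts the centroid, so later suggestions reflect it.
model.add_item("sports", [0.5, 0.5])
```

In this sketch the model "adapts" only in the sense that every call to `add_item` moves the cluster's centroid, which changes subsequent rankings; a real system could use a richer model, such as a learned distance metric or a classifier.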
