Patterns of reading and organizing information in document triage

People engaged in knowledge work must often rapidly identify valuable material from within large sets of potentially relevant documents. Document triage is a type of sensemaking task that involves skimming documents to get a sense of their content, evaluating documents to assess their worth in the context of the current activity, and organizing documents to prepare for their subsequent use and more in-depth reading. We have performed a study of document triage by collecting multiple forms of qualitative and quantitative data to characterize how 24 subjects read about a new topic and assessed and organized a set of 40 relevant Web documents. Our results indicate that there are multiple strategies for document triage, each involving different styles of reading, interacting, and organizing. Common strategies include: 1) focused reading early in the task, relegating the organizing until later in the process; 2) skimming performed in tandem with organizing, which relies on gaining an incremental understanding of the topic; and 3) metadata-based organizing, a strategy that stresses working with document surrogates to minimize the time spent reading. The findings suggest ways applications may better support the intertwined nature of the browsing, reading, and organizing activities in document triage.
