TileBars: visualization of term distribution information in full text information access

The field of information retrieval has traditionally focused on textbases consisting of titles and abstracts. As a consequence, many underlying assumptions must be altered for retrieval from full-length text collections. This paper argues for making use of text structure when retrieving from full text documents, and presents a visualization paradigm, called TileBars, that demonstrates the usefulness of explicit term distribution information in Boolean-type queries. TileBars simultaneously and compactly indicate relative document length, query term frequency, and query term distribution. The patterns in a column of TileBars can be quickly scanned and deciphered, aiding users in making judgments about the potential relevance of the retrieved documents.
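As a rough illustration of the idea (not the paper's implementation), the sketch below builds a text-mode approximation of a TileBar. It assumes the document has already been split into segments of tokens, uses one row per query term set and one cell per segment, and maps hit counts to a hypothetical four-level shading scale. The function name `tilebar`, the `SHADES` scale, and the sample data are all illustrative assumptions.

```python
from typing import List, Sequence

# Assumed four-level shading scale: '_' means no hits, '#' means three or more.
SHADES = "_.:#"


def tilebar(
    segments: Sequence[Sequence[str]],
    term_sets: Sequence[Sequence[str]],
) -> List[str]:
    """Return one text row per query term set, one character per segment.

    Row length reflects relative document length (number of segments);
    cell darkness reflects term frequency within that segment; the
    pattern across a row reflects term distribution.
    """
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = ""
        for seg in segments:
            # Count tokens in this segment that match any term in the set.
            hits = sum(1 for tok in seg if tok.lower() in wanted)
            # Clamp the count to the darkest available shade.
            row += SHADES[min(hits, len(SHADES) - 1)]
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical four-segment document, already tokenized.
    doc = [
        ["law", "court", "law"],
        ["network", "law"],
        ["weather"],
        ["network", "network", "network", "network"],
    ]
    for row in tilebar(doc, [["law", "court"], ["network"]]):
        print(row)
```

Stacking such rows for each retrieved document gives a column of bars in which a reader can see where the term sets overlap (here only in the second segment) and how long each document is.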
