Improving Exploration of Topic Hierarchies: Comparative Testing of Simplified Library of Congress Subject Heading Structures

Many large digital collections are organized by sorting their items into topics and arranging those topics hierarchically, for example in a tree view. The resulting information organization structures mitigate some of the challenges of searching digital information realms; however, the topic hierarchies are often large and complex, and thus difficult to navigate. Automated techniques have been shown to produce significantly smaller, simplified versions of existing topic hierarchies while preserving access to the majority of the collection, but these simplified hierarchies have never been tested with human participants, so it is unclear what effect simplification has on the exploration and use of such structures for browsing and retrieval. This study partly addresses this gap through a comparative test in which three groups of university students (N=62) performed ten topic hierarchy exploration tasks, each group using one of three versions of the Library of Congress Subject Headings (LCSH) hierarchy: (1) the original LCSH hierarchy, acting as a baseline; (2) a shallower version of (1); and (3) a narrower version of (2). A quantitative analysis of measures of accuracy, time, and browsing shows that participants using the simplified trees were significantly more accurate and faster than those using the unmodified tree, and that participants using the narrower, balanced tree were also faster than those using the shallower tree. These results show that automated topic hierarchy simplification can facilitate the use of such hierarchies, which has implications for the development of information organization theory and human-information interaction techniques for similar information structures.
