Analysis of large digital collections with interactive visualization

To make decisions about the long-term preservation of and access to large digital collections, archivists gather information such as the collections' contents, organizational structure, and file format composition. To date, the process of analyzing a collection, from data gathering to exploratory analysis and final conclusions, has largely been conducted using pen-and-paper methods. To help archivists analyze large-scale digital collections for archival purposes, we developed an interactive visual analytics application. The application narrows down different kinds of information about the collection and presents them as meaningful data views. Multiple views and analysis features can be linked or unlinked on demand, enabling researchers to compare and contrast different analyses and to identify trends. We present two user scenarios that show how the application allowed archivists to learn about a collection accurately, facilitated decision-making, and helped them arrive at conclusions.
