VisInfo: a digital library system for time series research data based on exploratory search—a user-centered design approach

Data-driven science is by now a widely accepted concept in the digital library (DL) context (Hey et al. in The fourth paradigm: data-intensive scientific discovery. Microsoft Research, 2009). In the same way, domain knowledge from information visualization, visual analytics, and exploratory search has found its way into the DL workflow. This trend is expected to continue, considering future DL challenges such as content-based access to new document types, visual search and exploration of information landscapes, or big data in general. To cope with these challenges, DL actors need to collaborate with external specialists from different domains so that they complement each other and succeed in tasks such as making research data publicly available. Through these interdisciplinary approaches, the DL ecosystem may contribute to applications focused on data-driven science and digital scholarship. In this work, we present VisInfo (2014), a web-based digital library system (DLS) that aims to provide visual access to time series research data. Based on an exploratory search (ES) concept (White and Roth in Synth Lect Inf Concepts Retr Serv 1(1):1–98, 2009), VisInfo first provides a content-based overview visualization of large amounts of time series research data. Further, the system enables the user to define visual queries by example or by sketch. Finally, VisInfo provides visual-interactive capabilities for the exploration of search results. The development process of VisInfo followed the user-centered design principle. Experts from computer science, a scientific digital library, and usability engineering, together with scientists from the earth and environmental sciences, were involved in an interdisciplinary approach. We report on comprehensive user studies in the requirement analysis phase based on paper prototyping, user interviews, screen casts, and user questionnaires. Heuristic evaluations and two usability testing rounds were applied during the system implementation and deployment phases and attest to measurable improvements for our DLS. Based on the lessons learned in VisInfo, we suggest a generalized project workflow that may be applied in related, prospective approaches.
