The Trials and Tribulations of Working with Structured Data: A Study on Information Seeking Behaviour

Structured data such as databases, spreadsheets and web tables is becoming critical in every domain and professional role. Yet we still know little about how people interact with it. Our research focuses on the information seeking behaviour of people looking for new sources of structured data online, including the task context in which the data will be used, data search, and the identification of relevant datasets from a set of possible candidates. We present a mixed-methods study covering in-depth interviews with 20 participants from various professional backgrounds, supported by an analysis of the search logs of a large data portal. Based on this study, we propose a framework for human structured-data interaction and discuss the challenges people encounter when trying to find and assess data that helps with their daily work. We provide design recommendations for data publishers and developers of online data platforms such as data catalogs and marketplaces. These recommendations highlight important questions for HCI research on improving how people engage with and make use of this useful online resource.
