Understanding data search as a socio-technical practice

Open research data are heralded as having the potential to increase effectiveness, productivity and reproducibility in science, but little is known about the actual practices involved in data search. The socio-technical problem of locating data for reuse is often reduced to the technological dimension of designing data search systems. We combine a bibliometric study of the current academic discourse around data search with interviews of data seekers. In this article, we explore how adopting a contextual, socio-technical perspective can help us understand user practices and behaviour and, ultimately, improve the design of data discovery systems.
