Social science data repositories in data deluge: A case study of ICPSR's workflow and practices

Purpose: With interest surging in the age of the data deluge, research on data infrastructures is becoming increasingly important. The open archival information system (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories. Because OAIS is a reference model that requires customization for actual practice, this paper examines how the current practices of a data repository map to the OAIS environment and functional components.

Design/methodology/approach: The authors conducted two focus-group sessions and one individual interview with eight employees at the world’s largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). By examining the employees’ current actions (activities regarding their work responsibilities) and IT practices, the authors studied the barriers and challenges of archiving and curating qualitative data at ICPSR.

Findings: The authors observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries. On the other hand, they found that the cost of preventing disclosure risk and a lack of agreement on standards for text data files are the most apparent obstacles for data curation professionals handling qualitative data. The maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.

Originality/value: The authors evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. They also answered questions such as how the current technological infrastructure in a leading data repository such as ICPSR supports its daily operations, what the ideal technologies in such repositories would be, and what challenges accompany these ideal technologies. Most importantly, they helped to prioritize challenges and barriers from the data curator’s perspective and contributed implications for data sharing and reuse in the social sciences.
