Big Data Sharing Among Academics

This chapter explores how academics share big data and the issues that such sharing raises. The first part reviews the literature on big data sharing practices using current technology. The second part presents case studies of disciplinary data repositories, describing and comparing their requirements and policies in three areas: Dryad for the life sciences, the Inter-university Consortium for Political and Social Research (ICPSR) for the social sciences, and the National Oceanographic Data Center (NODC) for the physical sciences.
