Paper Information - Big Data Curation

Big Data Curation

As data environments grow in variety and volume, organizations need processes and technologies that let them produce and maintain high-quality data, facilitating data reuse, accessibility, and analysis. In contemporary data management environments, data curation infrastructures play a key role in addressing the challenges common to many different data production and consumption environments. Recent changes in the scale of the data landscape bring major changes and new demands to data curation processes and technologies. This chapter investigates how the emerging big data landscape is defining new requirements for data curation infrastructures and how curation infrastructures are evolving to meet these challenges. Different dimensions of scaling up data curation for big data are described, including emerging technologies, economic models, incentive models, social aspects, and supporting standards. The analysis is grounded in literature research, interviews with domain experts, surveys, and case studies, and provides an overview of the state of the art, future requirements, and emerging trends in the field.

André Freitas | Edward Curry

[1] Jens Lehmann, et al. DBpedia: A Nucleus for a Web of Open Data, 2007, ISWC/ASWC.

[2] Abraham Bernstein, et al. How Useful Are Natural Language Interfaces to the Semantic Web for Casual End-Users?, 2007, ISWC/ASWC.

[3] Seán O'Riain, et al. Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends, 2012, IEEE Internet Computing.

[4] Li Qin, et al. Concept-level access control for the Semantic Web, 2003, XMLSEC '03.

[5] Henry Lieberman, et al. Watch what I do: programming by demonstration, 1993.

[6] Winston A Hide, et al. Big data: The future of biocuration, 2008, Nature.

[7] Anne E. Trefethen, et al. UK e-Science Programme: Next Generation Grid Applications, 2004, Int. J. High Perform. Comput. Appl.

[8] Linda C. Smith, et al. An Educational Program on Data Curation, 2007.

[9] Mark Hedges, et al. Sheer curation for experimental data and provenance, 2012, JCDL '12.

[10] Benjamin M. Good, et al. Games with a scientific purpose, 2011, Genome Biology.

[11] Hugh Glaser, et al. Linked Open Government Data: Lessons from Data.gov.uk, 2012, IEEE Intelligent Systems.

[12] Brian McMahon. Interactive publications and the record of science, 2010, Inf. Serv. Use.

[13] Nandana Mihindukulasooriya, et al. Rights declaration in Linked Data, 2013, COLD.

[14] Robert Neches, et al. Access Control Policies for Semantic Networks, 2009, 2009 IEEE International Symposium on Policies for Distributed Systems and Networks.

[15] Amit P. Sheth, et al. Changing Focus on Interoperability in Information Systems: From System, Syntax, Structure to Semantics, 1999.

[16] Diane M. Strong, et al. Beyond Accuracy: What Data Quality Means to Data Consumers, 1996, J. Manag. Inf. Syst.

[17] Jun Zhao, et al. Collective entity linking in web text: a graph-based method, 2011, SIGIR.

[18] Hugh D. Spence, et al. Minimum information requested in the annotation of biochemical models (MIRIAM), 2005, Nature Biotechnology.

[19] Edward Curry, et al. Towards Expertise Modelling for Routing Data Cleaning Tasks within a Community of Knowledge Workers, 2012, ICIQ.

[20] Carole L. Palmer, et al. Foundations of Data Curation: The Pedagogy and Practice of "Purposeful Work" with Research Data, 2013.

[21] Seán O'Riain, et al. A Semantic Best-Effort Approach for Extracting Structured Discourse Graphs from Wikipedia, 2012, WoLE@ISWC.

[22] James Cheney, et al. Curated databases, 2008, PODS.

[23] G J Williams, et al. The Protein Data Bank: a computer-based archival file for macromolecular structures, 1977, Journal of molecular biology.

[24] M. Ashburner, et al. Calling on a million minds for community annotation in WikiProteins, 2008, Genome Biology.

[25] Paul Buitelaar, et al. RelExt: A Tool for Relation Extraction from Text in Ontology Extension, 2005, SEMWEB.

[26] Jennifer Chu-Carroll, et al. Building Watson: An Overview of the DeepQA Project, 2010, AI Mag.