What Are Data? The Many Kinds of Data and Their Implications for Data Re-Use

A key aim of e-science is to encourage the archiving and release of data so that they are available in digitally processable forms for re-use almost from the point of collection. This assumes particular processes of translation by which data can be made visible in transportable and intelligible forms. It also requires mechanisms by which data quality and provenance can be trusted once the data are "disconnected" from their producers. By analyzing the "life stages" of data in four academic projects, we show that these requirements create difficulties for disciplines where tacit knowledge and craft-like methods are deeply embedded in researchers, as well as for disciplines producing non-digital heterogeneous data or data derived from people rather than from material phenomena. While craft practices and tacit knowledges are a feature of most scientific endeavors, some disciplines currently appear more inclined to attempt to formalize or at least record these knowledges. We discuss the implications this has for the e-science objective of widespread data re-use.
