Semantic eScience: encoding meaning in next-generation digitally enhanced science

Science is becoming increasingly dependent on data, yet traditional data technologies were not designed for the scale and heterogeneity of data in the modern world. Projects such as the Large Hadron Collider (LHC) and the Australian Square Kilometre Array Pathfinder (ASKAP) will generate petabytes of data that must be analyzed by hundreds of scientists working in multiple countries and speaking many different languages. The digital or electronic facilitation of science, or eScience [1], is now essential and becoming widespread. Clearly, data-intensive science, one component of eScience, must move beyond data warehouses and closed systems. It must instead strive to open data to those outside the main project teams, allow greater integration of sources, and provide interfaces for those who are expert scientists but not experts in data administration and computation. As eScience flourishes and the barriers to free and open access to data are lowered, other, more challenging questions are emerging, such as "How do I use this data that I did not generate?" or "How do I use this data type, which I have never seen, with the data I use every day?" or "What should I do if I really need data from another discipline but I cannot understand its terms?" This list of questions is long and growing as the use of data and information products increases and as more of science comes to rely on specialized devices.