Paper Information - Fostering Serendipity through Big Linked Data

Fostering Serendipity through Big Linked Data

The amount of bio-medical data available over the Web grows exponentially with time. The large volume of the currently available data makes it difficult to explore, while the velocity at which this data changes and the variety of formats in which bio-medical data is published make it difficult to access in an integrated form. Moreover, the lack of an integrated vocabulary makes querying this data difficult. In this paper, we advocate the use of Linked Data to integrate, query and visualize big bio-medical data. As a proof of concept, we show how the constant flow of bio-medical publications can be integrated with the 7.36 billion large Linked Cancer Genome Atlas dataset (TCGA). Then, we show how we can harness the value hidden in that data by making it easy to explore within a browsing interface. We evaluate the scalability of our approach by comparing the query execution time of our system with that of FedX on Linked TCGA.

Aftab Iqbal

[1] Stefan Decker, et al. Linked Cancer Genome Atlas Database, 2013, I-SEMANTICS '13.

[2] Katja Hose, et al. FedX: Optimization Techniques for Federated Query Processing on Linked Data, 2011, SEMWEB.

[3] Manfred Hauswirth, et al. DAW: Duplicate-AWare Federated Query Processing over the Web of Data, 2013, SEMWEB.