Biological knowledge can be extracted from these large and largely unexplored data, leading to data-driven genomic, transcriptomic and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is only limitedly supported: the metadata describing ENCODE datasets are quite simple and incomplete, and are not described by a coherent underlying ontology. Here, we show how to overcome this limitation by adopting an ENCODE metadata searching approach that uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System; we prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries; this allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of the found datasets to the biologists' queries.
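The difference between a purely syntactic search and an ontology-expanded semantic search can be illustrated with a minimal sketch. It is not the S.O.S. GeM implementation: the toy ontology `NARROWER`, the metadata records in `DATASETS`, and the functions `expand`, `syntactic_search` and `semantic_search` are all hypothetical names introduced here for illustration, and plain substring matching stands in for the indexing technology and the Unified Medical Language System resources used by the actual system.

```python
from collections import deque

# Hypothetical toy ontology: each concept maps to its direct, more specific concepts.
NARROWER = {
    "cancer": ["leukemia", "carcinoma"],
    "leukemia": ["chronic myeloid leukemia"],
    "carcinoma": ["hepatocellular carcinoma"],
}

# Hypothetical metadata records (dataset id -> free-text description).
DATASETS = {
    "ds1": "ChIP-seq on K562, chronic myeloid leukemia cell line",
    "ds2": "RNA-seq on HepG2, hepatocellular carcinoma",
    "ds3": "DNase-seq on GM12878, lymphoblastoid cell line",
    "ds4": "ChIP-seq on an unspecified cancer cell line",
}


def expand(term):
    """Return the term together with all of its transitive narrower concepts."""
    seen = {term}
    queue = deque([term])
    while queue:
        concept = queue.popleft()
        for child in NARROWER.get(concept, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def syntactic_search(term):
    """Match only datasets whose metadata literally contain the query term."""
    needle = term.lower()
    return sorted(d for d, text in DATASETS.items() if needle in text.lower())


def semantic_search(term):
    """Match datasets whose metadata contain the term or any narrower concept."""
    needles = expand(term.lower())
    return sorted(
        d for d, text in DATASETS.items()
        if any(n in text.lower() for n in needles)
    )


if __name__ == "__main__":
    print(syntactic_search("cancer"))
    print(semantic_search("cancer"))
```

In this toy setting, the query "cancer" matches only the record that mentions the word literally under syntactic search, whereas the expanded query also retrieves the leukemia and carcinoma records, which mirrors the claim that semantic search finds more relevant datasets.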