Technological advances in high-throughput techniques and efficient data-gathering methods, coupled with computational biology efforts, have produced a vast amount of life science data, often held in distributed and heterogeneous repositories. These repositories contain information such as sequence and structure data, annotations for biological data, results of complex computations, genetic sequences and multiple bio-datasets. However, the heterogeneity of these data has created a need for research in resource integration and platform-independent processing of investigative queries involving heterogeneous data sources. When processing huge amounts of data, information integration is one of the most critical issues, because it is crucial to preserve the intrinsic semantics of all the merged data sources. This integration would allow the proper organization of data, fostering analysis of and access to the information needed to accomplish critical tasks, such as processing micro-array data to study protein function and supporting medical researchers in making detailed studies of protein structures to facilitate drug design (Ignacimuthu, 2005). Furthermore, the DNA micro-array research community urgently requires technology that allows up-to-date micro-array data to be found, accessed and delivered in a secure framework (Sinnott, 2007). Several research disciplines where information integration is critical, such as Bioinformatics, could benefit from harnessing the potential of a new approach: the Semantic Web (SW). The term SW was coined by Berners-Lee, Hendler and Lassila (2001) to describe the evolution from a Web that consisted largely of documents for humans to read towards a new paradigm that includes data and information for computers to manipulate. The SW is about adding machine-understandable and machine-processable metadata to Web resources through its key enabling technology: ontologies (Fensel, 2002).
An ontology is a formal, explicit and shared specification of a conceptualization. The SW was conceived as a way to meet the need for data integration on the Web. This article presents SAMIDI, a Semantics-based Architecture for Micro-array Information and Data Integration. The most remarkable innovation offered by SAMIDI is the use of semantics as a tool for leveraging different vocabularies and terminologies and fostering integration. SAMIDI comprises two parts: a methodology for unifying heterogeneous data sources, driven by an analysis of the requirements of the unified data set, and a software architecture.
[1] Calton Pu, et al. Querying multiple bioinformatics information sources: can semantic web research help? SGMD, 2002.
[2] Alejandro Pazos Sierra, et al. Encyclopedia of Artificial Intelligence. 2008.
[3] V. Sugumaran. The Inaugural Issue of the International Journal of Intelligent Information Technologies. 2005.
[4] Chris F. Taylor, et al. The MGED Ontology: a resource for semantics-based description of microarray experiments. Bioinform., 2006.
[5] Toly Chen. A Fuzzy-Neural Approach with Collaboration Mechanisms for Semiconductor Yield Forecasting. Int. J. Intell. Inf. Technol., 2010.
[6] D. Murphy. Gene expression studies using microarrays: principles, problems, and prospects. Advances in physiology education, 2002.
[7] N. Goodman. Biological data becomes computer literate: new advances in bioinformatics. Current opinion in biotechnology, 2002.
[8] Frédéric Maire, et al. MDSM: Microarray database schema matching using the Hungarian method. Inf. Sci., 2006.
[9] Werner Ceusters, et al. Ontology-Assisted Database Integration to Support Natural Language Processing and Biomedical Data-mining. J. Integr. Bioinform., 2004.
[10] Tsung-Chih Lin, et al. System Identification Based on Dynamical Training for Recurrent Interval Type-2 Fuzzy Neural Network. Int. J. Fuzzy Syst. Appl., 2011.
[11] Vijay Kumar Mago, et al. Cross-Disciplinary Applications of Artificial Intelligence and Pattern Recognition: Advancing Technologies. 2011.
[12] Fabian Hemmert. Life in the Pocket--The Ambient Life Project: Life-Like Movements in Tactile Ambient. Int. J. Ambient Comput. Intell., 2009.
[13] Jacques Cohen, et al. Computer science and bioinformatics. CACM, 2005.
[14] Qingyu Zhang, et al. Data visualization and data mining of continuous numerical and discrete nominal-valued microarray databases for bioinformatics. 2006.
[15] J. Giardina, et al. Comparison of different microarray data analysis programs and description of a database for microarray data management. DNA and cell biology, 2004.
[16] A. F. Salam. Semantic Supplier Contract Monitoring and Execution DSS Architecture. Int. J. Intell. Inf. Technol., 2008.
[17] Richard O. Sinnott, et al. From access and integration to mining of secure genomic data sets across the Grid. Future Gener. Comput. Syst., 2007.