Biomedical Knowledge Extraction Using Fuzzy Differential Profiles and Semantic Ranking

DNA microarrays and related technologies now generate large-scale transcriptomic data that are used to explore the functional background of genes. Analyzing and interpreting such data requires substantial databases and efficient mining methods in order to extract the specific biological functions shared by the group of genes in an expression profile. To this end, we propose a new approach for mining transcriptomic data that combines domain knowledge with classification methods. First, we define Fuzzy Differential Gene Expression Profiles (FG-DEP), based on fuzzy classification and on a differential definition between the biological situations considered. Second, we use our previously defined semantic similarity measure, IntelliGO, which is applied to Gene Ontology (GO) annotation terms, to compute semantic and functional similarities between the genes of the resulting FG-DEP and well-known genetic markers involved in the development of cancers. The resulting similarity matrices are then used to introduce a novel Functional Spectral Representation (FSR), calculated through a semantic ranking of genes according to their similarity with the tumoral markers. The FSR should help experts interpret transcriptomic data in a new way and infer new genes with similar biological functions in relation to well-known diseases.
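The semantic ranking step can be illustrated with a minimal sketch. It assumes that a gene-by-marker similarity matrix has already been computed (for instance with a measure such as IntelliGO) and simply orders the genes by decreasing similarity to each marker. The function name `rank_genes_by_marker`, the gene and marker labels, and the similarity values are hypothetical and chosen only for illustration; the sketch is not the authors' FSR computation, whose details are not given here.

```python
def rank_genes_by_marker(similarity, genes, markers):
    """Return, for each marker, the genes sorted by decreasing similarity.

    similarity[i][j] is assumed to hold the similarity between genes[i]
    and markers[j]. Ties are broken alphabetically by gene name.
    """
    ranking = {}
    for j, marker in enumerate(markers):
        scored = [(similarity[i][j], gene) for i, gene in enumerate(genes)]
        # Sort by descending score, then by gene name for a stable order.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        ranking[marker] = [gene for _, gene in scored]
    return ranking


# Hypothetical genes, markers, and similarity values (illustration only).
genes = ["geneA", "geneB", "geneC"]
markers = ["marker1", "marker2"]
similarity = [
    [0.82, 0.10],  # geneA vs. marker1, marker2
    [0.35, 0.77],  # geneB
    [0.60, 0.40],  # geneC
]

ranking = rank_genes_by_marker(similarity, genes, markers)
for marker in markers:
    print(marker, ranking[marker])
```

Each ranked list gives one view of how close the genes of a profile are to a given marker; a representation such as the FSR would be built on top of rankings of this kind.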
