Fuzzy Clustering and Gene Ontology Based Decision Rules for Identification and Description of Gene Groups

This paper examines whether gene clustering that takes into account both gene expression values and the similarity of GO terms improves the quality of rule-based descriptions of gene groups. The results show that the Conditional Robust Fuzzy C-Medoids algorithm yields gene groups similar to those determined by domain experts. However, the differences observed in the clustering influence the quality of the group descriptions. The rules obtained cover more genes while retaining their statistical significance. The rule induction and post-processing method presented in the paper takes into account, among other things, the hierarchy of GO terms and a compound measure for evaluating the generated rules. The approach presented is unique: it makes it possible to limit the number of rules considerably and to obtain rules that reflect varied biological knowledge even when they cover the same genes.
