Hierarchical Classification of Gene Ontology with Learning Classifier Systems

The Gene Ontology (GO) project is a major bioinformatics initiative that aims to standardize the representation of gene and gene product attributes across species and databases. The classes in GO are hierarchically structured as a directed acyclic graph (DAG), which makes their prediction more complex. This work proposes an adapted Learning Classifier System (LCS) to predict protein functions described in the GO format. The proposed approach, called HLCS (Hierarchical Learning Classifier System), builds a global classifier that predicts all classes in the application domain. The classifier is expressed as a set of IF-THEN classification rules, which have the advantage of representing more comprehensible knowledge. HLCS is evaluated on four different ion-channel data sets structured in GO terms and compared with an Ant Colony Optimisation algorithm named hAnt-Miner. In the tests performed, HLCS outperformed hAnt-Miner on two of the four data sets.
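
To make the idea of IF-THEN rules over a DAG-structured class hierarchy concrete, the following is a minimal Python sketch. It is not the HLCS algorithm itself: the term labels (`T0` to `T3`), the attribute names, the first-match rule ordering, and the helper names `ancestors`, `Rule`, and `predict` are all assumptions made for illustration. It only shows how a single rule set can cover every class and how a prediction can be expanded to include all ancestor terms.

```python
from dataclasses import dataclass

# Hypothetical class DAG (not real GO terms): each term maps to its parents.
# "T3" has two parents, which a tree-shaped hierarchy could not express.
PARENTS = {
    "T0": [],
    "T1": ["T0"],
    "T2": ["T0"],
    "T3": ["T1", "T2"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


@dataclass
class Rule:
    """IF all conditions hold THEN predict `predicted` (illustrative only)."""
    conditions: dict  # attribute name -> required value
    predicted: str    # term in the class DAG

    def matches(self, example):
        return all(example.get(k) == v for k, v in self.conditions.items())


def predict(rules, example, default="T0"):
    """Use the first matching rule (an assumed policy), then add ancestors."""
    for rule in rules:
        if rule.matches(example):
            term = rule.predicted
            break
    else:
        term = default  # no rule fired: fall back to the root term
    return {term} | ancestors(term)


# Assumed toy rule set: one global list covers all classes in the DAG.
rules = [
    Rule({"motif_a": 1, "motif_b": 1}, "T3"),
    Rule({"motif_a": 1}, "T1"),
]
```

With these assumed rules, an example that has both motifs is assigned `T3` together with `T1`, `T2`, and `T0`, so the prediction stays consistent with the hierarchy.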