Multi-label Hierarchical Classification of Protein Functions with Artificial Immune Systems

This work proposes two versions of an Artificial Immune System (AIS), a relatively recent computational intelligence paradigm, for predicting protein functions described in the Gene Ontology (GO). The GO specifies functional classes (GO terms) in the form of a directed acyclic graph, which leads to a very challenging multi-label hierarchical classification problem in which a protein can be assigned multiple classes (functions, GO terms) across several levels of the GO's term hierarchy. Hence, the proposed approach, called MHC-AIS (Multi-label Hierarchical Classification with an Artificial Immune System), is a classification algorithm tailored to both multi-label and hierarchical classification. The first version of MHC-AIS builds a global classifier to predict all classes in the application domain, whilst the second version builds a local classifier to predict each class. In both versions the classifier is expressed as a set of IF-THEN classification rules, which have the advantage of representing knowledge that is comprehensible to biologist users. The two MHC-AIS versions are evaluated on a dataset of DNA-binding and ATPase proteins.
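To make the problem setting concrete, the following minimal Python sketch shows how a set of IF-THEN rules can assign several classes to one protein, and how each predicted class implies its ancestors in a directed acyclic graph. It is an illustration only, not the MHC-AIS algorithm: the class hierarchy `PARENTS`, the binary attributes `motif_a` and `motif_b`, the rule set `RULES`, and the helper names are all hypothetical, and the toy hierarchy is not the actual GO structure.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy hierarchy (NOT the real GO): each class maps to its parents.
# "dna_dependent_atpase" has two parents, so this is a DAG rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalytic": ["root"],
    "nucleic_acid_binding": ["binding"],
    "dna_binding": ["nucleic_acid_binding"],
    "atpase": ["catalytic"],
    "dna_dependent_atpase": ["atpase", "dna_binding"],
}

# A rule is a pair (antecedent, consequent): IF antecedent(protein) THEN class.
Rule = Tuple[Callable[[Dict[str, bool]], bool], str]

# Hypothetical rules over made-up binary attributes of a protein.
RULES: List[Rule] = [
    (lambda p: p["motif_a"], "dna_binding"),
    (lambda p: p["motif_b"], "atpase"),
    (lambda p: p["motif_a"] and p["motif_b"], "dna_dependent_atpase"),
]


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every class reachable from `term` by following parent links."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def predict(protein: Dict[str, bool],
            rules: List[Rule] = RULES,
            parents: Dict[str, List[str]] = PARENTS) -> Set[str]:
    """Collect the classes of all matching rules, then add their ancestors."""
    predicted: Set[str] = set()
    for antecedent, consequent in rules:
        if antecedent(protein):
            predicted.add(consequent)
            predicted |= ancestors(consequent, parents)
    return predicted


if __name__ == "__main__":
    # A protein matching only the first rule gets dna_binding and its ancestors.
    print(sorted(predict({"motif_a": True, "motif_b": False})))
```

In this sketch a single rule set covers every class, which loosely mirrors the idea of a global classifier; a local variant would instead keep one rule set per class and combine their outputs, again with ancestor propagation to keep predictions consistent with the hierarchy.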
