Hierarchical Classification with Dynamic-Threshold SVM Ensemble for Gene Function Prediction

This paper proposes a hierarchical classification approach based on an ensemble of SVMs with dynamic prediction thresholds. In the training phase, the hierarchical structure is used to select suitable positive and negative examples for each class's training set, which yields better SVM classifiers. An unseen example is then classified against all label classes in a top-down manner through the hierarchy. In particular, two strategies are proposed for determining a dynamic prediction threshold for each label class, again exploiting the hierarchical structure. Experiments on four genomic data sets show that the proposed training-set selection policies outperform two existing policies, and that both dynamic-threshold strategies perform better than fixed thresholds.
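The general shape of such a pipeline can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy hierarchy `PARENT`, the parent-restricted selection rule in `select_training_set`, the scaling rule in `dynamic_threshold`, and the premise that each per-class SVM output has already been mapped to a score in [0, 1]. The abstract does not spell out the paper's actual selection policies or threshold strategies, so this shows only one plausible instance of hierarchy-aware example selection and top-down prediction with per-class thresholds.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy hierarchy (child -> parent); None marks a top-level class.
PARENT: Dict[str, Optional[str]] = {
    "01": None,
    "01.01": "01",
    "01.02": "01",
    "01.01.03": "01.01",
}


def children_of(node: Optional[str]) -> List[str]:
    """Return the direct children of `node` (top-level classes for None)."""
    return sorted(c for c, p in PARENT.items() if p == node)


def select_training_set(
    label: str, labels_by_example: Dict[str, Set[str]]
) -> Tuple[List[str], List[str]]:
    """Assumed policy: positives carry `label`; negatives carry the parent
    of `label` but not `label` itself. For a top-level class, every example
    without `label` is a negative."""
    parent = PARENT[label]
    positives, negatives = [], []
    for example, labels in labels_by_example.items():
        if label in labels:
            positives.append(example)
        elif parent is None or parent in labels:
            negatives.append(example)
    return sorted(positives), sorted(negatives)


def dynamic_threshold(base: float, parent_score: Optional[float]) -> float:
    """Assumed rule: a confident parent lowers the child's threshold.
    Top-level classes (no parent score) use the base threshold."""
    if parent_score is None:
        return base
    return base * (1.0 - 0.5 * parent_score)


def predict_top_down(scores: Dict[str, float], base: float = 0.5) -> Set[str]:
    """Walk the hierarchy from the top; descend into a class's children
    only if the class itself passes its threshold."""
    predicted: Set[str] = set()
    frontier = [(c, None) for c in children_of(None)]
    while frontier:
        label, parent_score = frontier.pop()
        score = scores[label]
        if score >= dynamic_threshold(base, parent_score):
            predicted.add(label)
            frontier.extend((c, score) for c in children_of(label))
    return predicted


# Hypothetical per-class scores for one unseen gene.
example_scores = {"01": 0.9, "01.01": 0.3, "01.02": 0.2, "01.01.03": 0.8}
example_prediction = predict_top_down(example_scores)
```

In this sketch, a class whose parent was rejected is never evaluated, so the predicted label set is always consistent with the hierarchy. With a fixed threshold of 0.5 the class `01.01` (score 0.3) would be rejected and its subtree never visited, whereas the assumed dynamic rule admits it because its parent scored highly.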
