Hierarchical multi-label classification with chained neural networks

In classification tasks, an object usually belongs to exactly one class within a set of disjoint classes. In more complex tasks, an object can belong to more than one class, which is conventionally termed multi-label classification. There are also cases in which the set of classes is organised hierarchically and an object must be associated with a single path in this hierarchy, which defines hierarchical classification. Finally, in the most complex scenario, the classes are organised in a hierarchy and an object can be associated with multiple paths of this hierarchy. This is the problem investigated in this article: hierarchical multi-label classification (HMC). We address a typical HMC problem, protein function prediction, and propose an approach that chains multiple neural networks, performing both local and global optimisation to provide the final prediction: one or multiple paths in the hierarchy of classes. We experiment with four variations of this chaining process and compare these strategies with state-of-the-art HMC algorithms for protein function prediction, showing that our novel approach significantly outperforms these methods.
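To make the HMC setting concrete, the following minimal sketch shows one common way to represent a target that follows multiple paths in a class hierarchy: as a binary vector in which every labelled class also switches on all of its ancestors. The toy hierarchy, the class names, and the helper functions `with_ancestors` and `encode` are assumptions made for illustration; they are not taken from the article, and the sketch does not reproduce the chained-network method itself.

```python
# Hypothetical toy hierarchy, stored as child -> parent (None = top level).
PARENT = {
    "1": None,
    "2": None,
    "1.1": "1",
    "1.2": "1",
    "2.1": "2",
    "2.1.1": "2.1",
}

# Fixed class order used for the output vector.
CLASSES = sorted(PARENT)  # ['1', '1.1', '1.2', '2', '2.1', '2.1.1']


def with_ancestors(labels):
    """Return the given labels together with all of their ancestors."""
    closed = set()
    for label in labels:
        node = label
        # Walk up the hierarchy until the top level or an already-seen node.
        while node is not None and node not in closed:
            closed.add(node)
            node = PARENT[node]
    return closed


def encode(labels):
    """Encode a set of labels as a hierarchy-consistent binary vector."""
    closed = with_ancestors(labels)
    return [1 if c in closed else 0 for c in CLASSES]


# An object annotated with two paths: 1 -> 1.2 and 2 -> 2.1 -> 2.1.1
target = encode({"1.2", "2.1.1"})
print(target)  # [1, 0, 1, 1, 1, 1]
```

A model for this setting would then predict one such vector per object, so a single prediction can cover one path or several.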
