Hyperbolic Random Forests

Hyperbolic space is becoming a popular choice for representing data because many real-world datasets have a hierarchical structure, whether implicit or explicit. With it comes a need for algorithms that can solve fundamental tasks, such as classification, in hyperbolic space. Recently, multiple papers have investigated hyperbolic alternatives to hyperplane-based classifiers, such as logistic regression and SVMs. While effective, these approaches struggle with more complex hierarchical data. We therefore propose to generalize the well-known random forest to hyperbolic space. We do this by redefining the notion of a split using horospheres. Since finding the globally optimal split is computationally intractable, we find candidate horospheres through a large-margin classifier. To make hyperbolic random forests work on multi-class and class-imbalanced data, we furthermore outline a new method for combining classes based on their lowest common ancestor, as well as a class-balanced version of the large-margin loss. Experiments on standard and new benchmarks show that our approach outperforms both conventional random forest algorithms and recent hyperbolic classifiers.
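To make the notion of a horosphere split concrete, the following is a minimal sketch, not the authors' implementation. It assumes the Poincaré ball model, in which horospheres are the level sets of the Busemann function B_p(x) = log(||p - x||^2 / (1 - ||x||^2)) for an ideal point p on the unit sphere. The function names `busemann` and `horosphere_split`, the ideal point, and the offset are hypothetical choices for illustration; the abstract states that candidate horospheres come from a large-margin classifier, which this sketch does not reproduce.

```python
import math

def busemann(x, p):
    """Busemann function on the Poincare ball for an ideal point p with ||p|| = 1."""
    diff_sq = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
    norm_sq = sum(xi ** 2 for xi in x)
    return math.log(diff_sq / (1.0 - norm_sq))

def horosphere_split(points, p, offset):
    """Partition points by the side of the horosphere {x : B_p(x) = offset} they fall on."""
    inside, outside = [], []
    for x in points:
        (inside if busemann(x, p) <= offset else outside).append(x)
    return inside, outside

# Hypothetical example: ideal point on the positive x-axis, offset 0
# (the horosphere through the origin).
p = (1.0, 0.0)
points = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5)]
inside, outside = horosphere_split(points, p, offset=0.0)
```

In a tree, a node would store the pair (p, offset) and route each sample to a child according to this test, in the same way a Euclidean decision tree routes on a feature threshold.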
