Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features

For supervised learning with tabular data, decision tree ensembles produced via boosting techniques generally dominate real-world applications involving iid training/test sets. However, for graph data, where the iid assumption is violated by structured relations between samples, it remains unclear how best to incorporate this structure within existing boosting pipelines. To this end, we propose a generalized framework that interleaves boosting with graph propagation steps, sharing node/sample information across the edges that connect related samples. Unlike previous efforts to integrate graph-based models with boosting, our approach is anchored in a principled meta loss function, such that provable convergence can be guaranteed under relatively mild assumptions. Across a variety of non-iid graph datasets with tabular node features, our method achieves performance comparable or superior to both tabular models and graph neural networks, as well as existing hybrid strategies that combine the two. Beyond its predictive performance, our framework is easy to implement, computationally more efficient, and enjoys stronger theoretical guarantees than recently proposed graph models (which also makes our results more reproducible).
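To make the alternation concrete, below is a minimal sketch of the general pattern: gradient-boosting rounds (shallow regression trees fit to pseudo-residuals) interleaved with Jacobi-style graph smoothing toward a label-propagation fixed point. The function names, the propagation operator, and all hyperparameters here are illustrative assumptions, not the paper's exact algorithm or meta loss.

```python
# Sketch: boosting rounds interleaved with graph propagation.
# All names and the specific propagation rule are illustrative
# assumptions, not the authors' exact method.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a dense adjacency matrix."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def smooth(f, S, alpha=0.9, steps=10):
    """Jacobi-style iteration toward the fixed point of
    z = alpha * S z + (1 - alpha) * f (cf. label propagation)."""
    z = f.copy()
    for _ in range(steps):
        z = alpha * (S @ z) + (1.0 - alpha) * f
    return z

def boosted_smoothing(X, y, A, train_idx, rounds=50, lr=0.1):
    """X: node features, y: labels for the nodes in train_idx,
    A: adjacency matrix. Alternates weak-learner fits on residuals
    of the *smoothed* prediction with re-propagation over the graph."""
    S = normalized_adjacency(A)
    f = np.zeros(X.shape[0])      # raw boosted scores, one per node
    z = smooth(f, S)              # graph-smoothed prediction
    for _ in range(rounds):
        resid = y - z[train_idx]  # pseudo-residuals on labeled nodes
        tree = DecisionTreeRegressor(max_depth=3).fit(X[train_idx], resid)
        f += lr * tree.predict(X) # boosting update over all nodes
        z = smooth(f, S)          # propagate the updated scores
    return z
```

Note that the residuals are taken against the smoothed prediction z rather than the raw boosted scores f, so each weak learner corrects errors that remain after propagation; this is one natural way to realize a single objective that couples the boosting and smoothing components.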
