Collaborative learning by boosting in distributed environments

In this paper we propose a new distributed learning method, the distributed network boosting (DNB) algorithm, for applications where data are spread across networked sites. The learned hypotheses are exchanged between neighboring sites during the learning process. Theoretical analysis shows that the DNB algorithm minimizes the cost function through collaborative functional gradient descent in hypothesis space. Comparisons of the DNB algorithm with other distributed learning methods on real data sets of different sizes demonstrate its effectiveness.
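To make the exchange-and-descend idea concrete, the following is a minimal sketch of a DNB-style training loop, not the paper's exact algorithm. It assumes a fixed ring topology, decision stumps as weak learners, and an AdaBoost-style exponential-loss functional gradient step; the neighbor-exchange rule, the coefficient computation, and all names (`make_site_data`, `neighbors`, `margin`) are illustrative assumptions.

```python
# Sketch of a distributed-network-boosting-style loop (illustrative only; the
# topology, exchange rule, and weight update are assumptions, not the paper's
# exact DNB formulation).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def make_site_data(n=200):
    """Toy binary classification data local to one site (labels in {-1, +1})."""
    X = rng.normal(size=(n, 2))
    y = np.where(X[:, 0] + X[:, 1] + 0.3 * rng.normal(size=n) > 0, 1, -1)
    return X, y

n_sites, n_rounds = 4, 10
sites = [make_site_data() for _ in range(n_sites)]
neighbors = {i: [(i - 1) % n_sites, (i + 1) % n_sites] for i in range(n_sites)}  # ring
ensembles = [[] for _ in range(n_sites)]  # per-site list of (coefficient, hypothesis)

def margin(ensemble, X):
    """Real-valued ensemble output F(x) = sum_t alpha_t * h_t(x)."""
    if not ensemble:
        return np.zeros(len(X))
    return sum(a * h.predict(X) for a, h in ensemble)

for _ in range(n_rounds):
    # 1. Each site fits a weak hypothesis to its locally re-weighted data;
    #    the weights follow the negative functional gradient of the exponential loss.
    local_hyps = []
    for i, (X, y) in enumerate(sites):
        w = np.exp(-y * margin(ensembles[i], X))
        w /= w.sum()
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        local_hyps.append(h)

    # 2. Hypotheses are exchanged with neighboring sites; each site appends its
    #    own and the received hypotheses with AdaBoost-style coefficients
    #    computed on its *local* data.
    for i, (X, y) in enumerate(sites):
        for h in [local_hyps[i]] + [local_hyps[j] for j in neighbors[i]]:
            w = np.exp(-y * margin(ensembles[i], X))
            w /= w.sum()
            err = np.clip(w[h.predict(X) != y].sum(), 1e-6, 1 - 1e-6)
            alpha = 0.5 * np.log((1 - err) / err)
            ensembles[i].append((alpha, h))

# Report per-site training accuracy of the collaboratively built ensembles.
for i, (X, y) in enumerate(sites):
    acc = np.mean(np.sign(margin(ensembles[i], X)) == y)
    print(f"site {i}: train accuracy {acc:.3f}")
```

In this sketch, collaboration enters only through the hypotheses that cross site boundaries; raw data never leave a site, which is the property that makes such boosting schemes attractive for distributed settings.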
