GADGET SVM: a Gossip-bAseD sub-GradiEnT SVM solver

Distributed environments such as federated databases, wireless and sensor networks, and Peer-to-Peer (P2P) networks are becoming increasingly popular and are well suited to machine learning, since they can store large quantities of data across a network. The distributed setting is complex in part because network topologies are often dynamic and the data available to algorithms changes frequently. Furthermore, in many distributed scenarios (such as sensor networks) nodes may have limited resources. Distributed Data Mining (DDM; Kargupta & Chan, 2000; Demers et al., 2002; Guo & (editors), 1999; Provost, 2000) and Machine Learning algorithms created for these settings must therefore have high utility, incur little communication cost, work on dynamic networks, and be computationally efficient.

[1] Philip K. Chan, et al. Advances in Distributed and Parallel Knowledge Discovery, 2000.

[2] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.

[3] Igor Durdanovic, et al. Parallel Support Vector Machines: The Cascade SVM, 2004, NIPS.

[4] Johannes Gehrke, et al. Gossip-based computation of aggregate information, 2003, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings.

[5] Vasant Honavar, et al. Learning Support Vector Machines from Distributed Data Sources, 2005, AAAI.

[6] Yike Guo, et al. High Performance Data Mining: Scaling Algorithms, Applications and Systems, 2000.

[7] Zoubin Ghahramani, et al. Proceedings of the 24th international conference on Machine learning, 2007, ICML 2007.

[8] Huan Liu, et al. Handling concept drifts in incremental learning with support vector machines, 1999, KDD '99.

[9] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.

[10] Foster Provost, et al. Distributed Data Mining: Scaling up and beyond, 2000.

[11] Pascal Frossard, et al. Distributed SVM Applied to Image Classification, 2006, 2006 IEEE International Conference on Multimedia and Expo.

[12] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[13] David Kempe, et al. A decentralized algorithm for spectral analysis, 2004, STOC '04.

[14] Vwani P. Roychowdhury, et al. Distributed Parallel Support Vector Machines in Strongly Connected Networks, 2008, IEEE Transactions on Neural Networks.

[15] Vangelis Metsis, et al. Spam Filtering with Naive Bayes - Which Naive Bayes?, 2006, CEAS.