Decentralized Distributed Computing System for Privacy-Preserving Combined Classifiers - Modeling and Optimization

The growing volume and variety of information creates a need for efficient network computing systems, since in many cases a single machine cannot process and analyze the data effectively. One promising approach to distributed data analysis is combined classification, which can be implemented relatively easily in distributed computing systems. In this paper we address the problem of designing a decentralized distributed computing system for this classification method. We focus on system fairness. The performance metric is the maximum response time, i.e., the computing system should be designed so that the largest response time experienced by any client using the system is minimized. We assume that the system is decentralized and that each request is sent by the client directly to the computing nodes without the assistance of a central service. An ILP (Integer Linear Programming) model is formulated and solved to optimality using the branch-and-cut algorithm included in the CPLEX solver. Extensive simulations are performed to evaluate the properties of the computing system with respect to several parameters describing the system.
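The min-max (fairness) objective described above can be illustrated on a toy instance. The sketch below is not the paper's ILP model, whose variables and constraints are not given here. It assumes a simplified response-time model in which each client issues one request, each request is assigned to exactly one node, and every request on a node finishes at (total work on the node) / (node speed). The function names `max_response_time` and `best_assignment` and all numbers are hypothetical, and exhaustive enumeration stands in for the branch-and-cut search only because the instance is tiny.

```python
from itertools import product


def max_response_time(assignment, work, speed):
    """Largest client response time for one request-to-node assignment.

    Assumed toy model (not from the paper): a node shares its capacity among
    the requests assigned to it, so each of them finishes at
    (total work on that node) / (speed of that node).
    """
    load = [0.0] * len(speed)
    for client, node in enumerate(assignment):
        load[node] += work[client]
    # Only nodes that actually serve a client contribute a response time.
    return max(load[node] / speed[node] for node in set(assignment))


def best_assignment(work, speed):
    """Enumerate every assignment and keep the one with the smallest maximum."""
    nodes = range(len(speed))
    return min(
        product(nodes, repeat=len(work)),
        key=lambda a: max_response_time(a, work, speed),
    )


# Hypothetical instance: three client requests, two computing nodes.
work = [4.0, 2.0, 2.0]   # amount of computation per client request
speed = [2.0, 1.0]       # processing speed of each node

best = best_assignment(work, speed)
print("assignment:", best)
print("maximum response time:", max_response_time(best, work, speed))
```

Enumeration grows exponentially with the number of requests, which is why an ILP formulation handed to a solver is the practical route for realistic instances.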
