Load balance for semantic cluster-based data integration systems

Data integration systems based on Peer-to-Peer (P2P) environments have been developed to integrate dynamic, autonomous, and heterogeneous data sources on the Web. Some of these systems cluster their data sources semantically, which reduces the search space. However, the clusters may become overloaded, and traditional load-balancing strategies are not suitable for semantic clusters. In this paper, we discuss the limitations of load-balancing strategies in semantic clusters, propose a solution to this load-balancing problem, and present some experimental results.
