A peer-to-peer search in data grids based on ant colony optimization

A method for (1) efficient discovery of data in large distributed raw datasets and (2) collection of the data thus procured is considered. It is a pure peer-to-peer method without any centralized control and is therefore primarily intended for large-scale, dynamic (data) grid environments. It provides a simple but highly efficient mechanism for keeping the load it causes under control and proves especially useful if data discovery and collection are to be performed simultaneously with dataset generation. The method supports user-specified extraction of structured metadata from raw datasets and automatically aggregates the extracted metadata. It is based on the principle of ant colony optimization (ACO). The paper focuses on effective data aggregation and includes a detailed description of the modifications to the basic ACO algorithm that are needed for effective aggregation of the extracted data. Using a simulator, the method was extensively tested on a wide set of network topologies for different rates of data extraction and aggregation. Results of the most significant tests are included.