Searching raw datasets in data grids using ant colony optimization

A pure peer-to-peer method for (1) efficient discovery of data in large distributed raw datasets and (2) collection of the data thus procured is considered. It is based on ant colony optimization (ACO), supports user-specified extraction of structured metadata from raw datasets, and automatically aggregates the extracted metadata. The paper focuses on effective data aggregation and includes a detailed description of the modifications to the basic ACO algorithm that this aggregation requires. Using a simulator, the method was extensively tested on a wide set of network topologies for different rates of data extraction and aggregation. Results of the most significant tests are included.