Dynamic Load Balancing Model: Preliminary Results for Parallel Pseudo-search Engine Indexers/Crawler Mechanisms Using MPI and Genetic Programming

Methodologies derived from Genetic Programming (GP) and Knowledge Discovery in Databases (KDD) were used in the parallel implementation of an indexer simulator that emulates current World Wide Web (WWW) search engine indexers. The indexer followed the strategy employed by Alta Vista and Inktomi, which index every word in every Web document. Insights gained from the initial implementation of this simulator led to the first phase of the adaptation of a biological model. The biological model will offer a basis for future developments associated with an integrated Pseudo-Search Engine: its basic characteristics will be translated into a model of an integrated search engine using GP. The evolutionary processes exhibited by this biological model will provide mechanisms not only for the storage, processing, and retrieval of valuable information but also for Web crawlers and for an advanced communication system. The current Pseudo-Search Engine Indexer, capable of organizing limited subsets of Web documents, provides a foundation for the first simulator of this model. Adapting the model to refine the Pseudo-Search Engine establishes order in the inherent interactions between the indexer, crawler, and browser mechanisms by including the social (hierarchical) structure and simulated behavior of this complex system. The simulated behavior will give rise to mechanisms that are controlled and coordinated at their various levels of complexity. This model will also provide a foundation for an evolutionary expansion of the search engine as the number of WWW documents continues to grow. The simulator results were generated using the Message Passing Interface (MPI) on a network of SUN workstations and an IBM SP2 computer system.
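
As an illustration of the indexing strategy mentioned above (indexing each word in each Web document), the following is a minimal Python sketch, not the simulator's implementation. It assumes a simple round-robin split of documents across workers as a stand-in for distribution over MPI processes. All names (`build_full_text_index`, `partition`, `merge_indexes`) and the tokenization rule are hypothetical.

```python
import re
from collections import defaultdict


def build_full_text_index(documents):
    """Map every word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumed tokenization: lowercase runs of letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def partition(documents, n_workers):
    """Round-robin split of documents (stand-in for distribution to workers)."""
    parts = [{} for _ in range(n_workers)]
    for i, (doc_id, text) in enumerate(sorted(documents.items())):
        parts[i % n_workers][doc_id] = text
    return parts


def merge_indexes(partials):
    """Combine per-worker partial indexes into a single index."""
    merged = defaultdict(set)
    for partial in partials:
        for word, doc_ids in partial.items():
            merged[word] |= doc_ids
    return merged


documents = {
    "a.html": "Genetic programming for search engines",
    "b.html": "Parallel search with message passing",
    "c.html": "Genetic algorithms and parallel indexers",
}

# Each "worker" indexes its own share; the partial results are then merged.
partials = [build_full_text_index(part) for part in partition(documents, 2)]
index = merge_indexes(partials)
```

Because every word is indexed, the merged result is the same as indexing all documents in a single process; only the division of work changes.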
