A parallel genetic algorithm for rule discovery in large databases

This paper presents GA-PVMINER, a parallel genetic algorithm that uses the Parallel Virtual Machine (PVM) to discover rules in a database. The system follows the Michigan approach, in which each individual represents a single rule of the form "if condition then prediction". GA-PVMINER is based on the concept learning framework, but it performs a generalization of the classification task known as dependence modeling (sometimes also called generalized rule induction). In this task, different discovered rules can predict the value of different goal attributes in the "prediction" part of the rule, whereas in classification all discovered rules predict the value of the same goal attribute. The global population of genetic algorithm individuals is divided into several subpopulations, each assigned to a distinct processor. Within each subpopulation, all individuals represent rules with the same goal attribute in the "prediction" part of the rule, and different subpopulations evolve rules predicting different goal attributes. The system exploits both data parallelism and function parallelism.
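
The rule representation and the partition of the population by goal attribute can be illustrated with a minimal Python sketch. It is not the GA-PVMINER implementation: the class `Rule`, the function `split_by_goal`, the attribute names, and the restriction of conditions to conjunctions of attribute = value tests are all assumptions made for illustration. PVM is not used here; each dictionary key simply stands in for the processor that would host the corresponding subpopulation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Mapping, Tuple


@dataclass(frozen=True)
class Rule:
    """One individual (Michigan approach): a single 'if condition then prediction' rule."""
    # Assumed encoding: the condition is a conjunction of (attribute, value) equality tests.
    condition: Tuple[Tuple[str, str], ...]
    # The prediction is a (goal attribute, predicted value) pair.
    prediction: Tuple[str, str]

    @property
    def goal_attribute(self) -> str:
        return self.prediction[0]

    def covers(self, record: Mapping[str, str]) -> bool:
        """True if the record satisfies every test in the condition."""
        return all(record.get(attr) == value for attr, value in self.condition)


def split_by_goal(population: List[Rule]) -> Dict[str, List[Rule]]:
    """Group rules so that each subpopulation predicts a single goal attribute."""
    subpopulations: Dict[str, List[Rule]] = defaultdict(list)
    for rule in population:
        subpopulations[rule.goal_attribute].append(rule)
    return dict(subpopulations)


# Hypothetical rules over made-up attributes ("age", "income", "buys").
population = [
    Rule((("age", "young"),), ("buys", "yes")),
    Rule((("income", "high"), ("age", "old")), ("buys", "no")),
    Rule((("buys", "yes"),), ("income", "high")),
]
subpopulations = split_by_goal(population)
```

In this sketch the two rules predicting `buys` fall into one subpopulation and the rule predicting `income` into another, mirroring the way different subpopulations evolve rules for different goal attributes.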