The Partition Transform Algorithm of Join Query

Distributed data processing and information integration systems usually involve join queries defined over the base relations of the data sources. These queries may be issued repeatedly, and the result of each join query can be very large. Reducing the network traffic they induce is therefore important to the overall performance of a distributed or information integration system. This paper introduces a partition algorithm based on the query definition, with the aim of reducing communication costs within the distributed system. Experiments show that the algorithm reduces traffic and decreases the number of partitions. In addition, the technique can be applied to simplify the definitions of materialized views and to improve the efficiency of their self-maintenance.