A distributed and mobile data mining system

Most popular data mining algorithms are designed for centralized data and often ignore the resource constraints of distributed and mobile environments. In support of the third generation of data mining systems, which operate on distributed and massive data, we propose an efficient distributed and mobile algorithm for global association rule mining. The algorithm does not need to ship all local data to one site and therefore avoids excessive network communication cost. It is implemented in PL/SQL so that association rule mining is coupled with the relational database systems widely used in organizations and communities. Experiments show that this PL/SQL implementation outperforms the classic Apriori algorithm on large problem sizes by factors ranging from 2 to more than 20, and that the gap widens as the volume of transactions grows.
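The idea of mining global association rules without centralizing the raw transactions can be illustrated with a minimal sketch. This is not the PL/SQL algorithm described above. It is a generic count-exchange scheme with Apriori-style candidate generation, written in Python under the assumption that each site can count candidate itemsets locally and send only those counts to a coordinator. The names `local_counts`, `global_frequent_itemsets`, `sites`, and `max_size` are hypothetical, and the sites are simulated as in-memory lists.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Run at one site: count local transactions containing each candidate."""
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def global_frequent_itemsets(sites, min_support, max_size=3):
    """Coordinator: merge per-site counts instead of collecting raw data.

    `sites` is a list of local transaction lists (an assumption made to
    simulate the distributed setting in one process). `min_support` is a
    fraction of the total number of transactions across all sites.
    """
    total = sum(len(site) for site in sites)
    threshold = min_support * total

    # Level 1 candidates: every single item reported by any site.
    items = sorted({item for site in sites for t in site for item in t})
    candidates = [frozenset([item]) for item in items]

    frequent = {}
    k = 1
    while candidates and k <= max_size:
        merged = Counter()
        for site in sites:
            # Only the count table would cross the network, not the rows.
            merged.update(local_counts(site, candidates))

        level = {c: n for c, n in merged.items() if n >= threshold}
        frequent.update(level)

        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))]
    return frequent


if __name__ == "__main__":
    sites = [
        [["a", "b"], ["a", "c"]],
        [["a", "b", "c"], ["b"]],
    ]
    for itemset, count in global_frequent_itemsets(sites, 0.5).items():
        print(sorted(itemset), count)
```

In this sketch the communication cost per round is proportional to the number of candidate itemsets rather than to the number of transactions, which is the general reason such schemes avoid shipping local data to one site.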