An adaptive memory conscious approach for mining frequent trees: implications for multi-core architectures

We consider the problem of frequent tree mining and present algorithms targeting emerging single-chip multiprocessor (CMP) architectures. We explore algorithmic designs that improve the memory performance of such algorithms, both by alleviating memory latency and by reducing off-chip traffic. We then explore adaptive task-parallel and data-parallel design strategies that facilitate effective parallelization even in the presence of data and workload skew, while minimizing parallelization overheads. We show that our optimized algorithms achieve orders-of-magnitude improvements in both run time and memory usage compared to state-of-the-art algorithms. We also show that our adaptive parallelization strategy achieves near-linear speedups on a modern dual quad-core system.