A cost-aware adaptive multiple-table join evaluation in MapReduce

MapReduce has become an effective tool for large-scale data analysis. It is naturally suited to group-by aggregation tasks rather than to the join operator, which is common in real analysis work. Existing join methods in MapReduce perform differently in different cases, which makes it difficult to choose a good join plan for a given join list. Current static optimization cannot generate an efficient evaluation plan for a given join list. In this paper, we introduce several custom join techniques and then propose an adaptive join plan generator for multiple joins that combines a rule-based model with a cost-based model and takes the intermediate data into account.