A Selection Method of Join Strategy in Column Store Based Query

Optimizing the join strategy among columns is the most important problem in column-store-based queries. Current column-oriented systems use a single join strategy throughout a whole query, with little optimization, so performance remains unsatisfactory. This paper presents a method for selecting the join strategy. First, query plans with excessive cost are removed by applying several simple rules, yielding a candidate query plan tree. Then a dynamic optimization algorithm is proposed to improve the candidate query plan following the Huffman tree principle. Given the characteristics of column-oriented data storage, the execution of each join node in the plan can be classified into one of two strategies: pipeline and parallel. A cost model is then proposed that selects the optimal strategy for each join node by estimating the costs of the pipeline and parallel strategies with low time complexity.
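
The two ideas above, Huffman-style plan construction and per-node strategy selection, can be illustrated with a minimal Python sketch. It is not the algorithm or cost model of this paper, whose details the abstract does not give. The names `PlanNode`, `build_huffman_plan`, and `choose_strategies`, the `selectivity` and `merge_overhead` parameters, and both cost formulas are hypothetical placeholders chosen only for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class PlanNode:
    """A leaf (base column) or an inner join node of a binary plan tree."""
    name: str
    rows: float                        # estimated cardinality
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None
    strategy: Optional[str] = None     # "pipeline" or "parallel" for join nodes


def build_huffman_plan(inputs: Dict[str, float], selectivity: float) -> PlanNode:
    """Greedily join the two smallest inputs first, as in Huffman tree construction.

    Assumption: the join output size is selectivity * |left| * |right|.
    """
    tie = count()  # unique tie-breaker so PlanNode objects are never compared
    heap = [(rows, next(tie), PlanNode(name, rows)) for name, rows in inputs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)  # smallest estimated input
        _, _, b = heapq.heappop(heap)  # second smallest estimated input
        joined = PlanNode(f"({a.name} JOIN {b.name})",
                          selectivity * a.rows * b.rows, a, b)
        heapq.heappush(heap, (joined.rows, next(tie), joined))
    return heap[0][2]


def choose_strategies(node: PlanNode, merge_overhead: float) -> None:
    """Label every join node with the cheaper of two placeholder cost estimates."""
    if node.left is None or node.right is None:
        return  # leaf: nothing to decide
    choose_strategies(node.left, merge_overhead)
    choose_strategies(node.right, merge_overhead)
    # Placeholder costs (assumptions): pipeline handles the inputs one after
    # the other; parallel overlaps them but pays a fixed merge overhead.
    pipeline_cost = node.left.rows + node.right.rows
    parallel_cost = max(node.left.rows, node.right.rows) + merge_overhead
    node.strategy = "pipeline" if pipeline_cost <= parallel_cost else "parallel"


root = build_huffman_plan({"A": 100, "B": 200, "C": 5000}, selectivity=0.001)
choose_strategies(root, merge_overhead=50)
print(root.name, root.strategy, root.left.strategy)
```

Each node's strategy is decided with a constant number of arithmetic operations, so labelling the whole plan is linear in the number of join nodes. That mirrors, in a simplified way, the low time complexity the abstract claims for its cost model.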