MapReduce query processing systems translate a query statement into a query plan consisting of a set of MapReduce jobs to be executed on distributed machines. During query translation, these systems allocate computing resources uniformly by applying the same tuning to every job in the query plan. However, each job may implement its own collection of operators, which leads to different usage of computing resources. In this paper, we propose an adaptive tuning mechanism that allocates specific resources to each job within a query plan. The mechanism relies on a data structure that maps jobs to tuning codes, built by analyzing source code and log files. Because the data structure holds pre-computed tuning codes, job-specific resources can be assigned to the query plan at runtime.
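The job-to-tuning mapping described above can be pictured with a minimal sketch. It is not the paper's implementation: the class `TuningTable`, the function `tune_plan`, the operator names, and all parameter values are hypothetical, and keying the table by a job's set of operators is an assumption made only for illustration. The two configuration keys are standard Hadoop MapReduce property names.

```python
from dataclasses import dataclass, field

# Assumed uniform tuning, used when no job-specific entry exists (illustrative values).
DEFAULT_TUNING = {"mapreduce.job.reduces": 1, "mapreduce.task.io.sort.mb": 100}


@dataclass
class TuningTable:
    """Hypothetical table mapping a job's operator set to a pre-computed tuning."""
    entries: dict = field(default_factory=dict)

    def register(self, operators, tuning):
        # A frozenset makes the key hashable and independent of operator order.
        self.entries[frozenset(operators)] = dict(tuning)

    def lookup(self, operators):
        # Fall back to the uniform tuning for operator sets never seen before.
        return self.entries.get(frozenset(operators), DEFAULT_TUNING)


def tune_plan(plan, table):
    """Return one tuning per job instead of a single tuning for the whole plan."""
    return {job_id: table.lookup(ops) for job_id, ops in plan.items()}


# In the paper the entries come from source code and log analysis;
# here they are filled in by hand with made-up values.
table = TuningTable()
table.register(["TableScan", "Filter"],
               {"mapreduce.job.reduces": 0, "mapreduce.task.io.sort.mb": 100})
table.register(["Join", "GroupBy"],
               {"mapreduce.job.reduces": 8, "mapreduce.task.io.sort.mb": 256})

# A query plan with three jobs, each described by the operators it implements.
plan = {
    "job_1": ["TableScan", "Filter"],
    "job_2": ["GroupBy", "Join"],
    "job_3": ["Sort"],
}
tuned = tune_plan(plan, table)
```

In this sketch `job_1` and `job_2` receive their own pre-computed settings, while `job_3`, whose operator set has no entry, falls back to the uniform default.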