Paper Information - PRISM: Fine-Grained Phase and Resource Information-aware Scheduler for Map-Reduce


In recent years, MapReduce has become a popular model for data-intensive computation. To reduce the execution time of data-intensive jobs, MapReduce breaks each job into small map and reduce tasks and executes them in parallel across a large number of machines. However, existing schedulers work mainly at the task level, which yields sub-optimal job performance because a task's resource requirements may vary over its lifetime. This makes it difficult for task-level schedulers to use the available resources effectively and so reduce job execution time. To address this limitation, PRISM (Phase and Resource Information-aware Scheduler for MapReduce) is introduced. PRISM is a fine-grained, resource-aware MapReduce scheduler that divides tasks into phases and performs resource-aware scheduling at the level of phases. Each phase has a constant resource usage profile, and the scheduler aims to ensure that no phase suffers from starvation. PRISM also offers high resource utilization and provides a 1.3x improvement in job running time compared with the current Hadoop schedulers.
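The difference between task-level and phase-level scheduling can be illustrated with a small sketch. This is not PRISM's algorithm: the phase names, the resource numbers, the node capacity, and the greedy `admit` helper are all hypothetical, chosen only to show why reserving a task's peak demand for its whole lifetime can admit fewer concurrent tasks than tracking the demand of the phase each task is currently in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    cpu: float  # hypothetical CPU demand, in cores
    io: float   # hypothetical I/O demand, in MB/s


# Hypothetical phase profiles for one reduce task (illustrative numbers only).
SHUFFLE = Phase("shuffle", cpu=0.5, io=40.0)
REDUCE = Phase("reduce", cpu=2.0, io=5.0)
TASK_PHASES = [SHUFFLE, REDUCE]


def peak_demand(phases):
    """Demand a task-level scheduler would reserve: the per-resource peak."""
    return max(p.cpu for p in phases), max(p.io for p in phases)


def admit(demands, cap_cpu, cap_io):
    """Greedily admit (cpu, io) demands in order while both resources fit."""
    used_cpu = used_io = 0.0
    admitted = 0
    for cpu, io in demands:
        if used_cpu + cpu <= cap_cpu and used_io + io <= cap_io:
            used_cpu += cpu
            used_io += io
            admitted += 1
    return admitted


# Assumed node capacity: 4 cores and 100 MB/s of I/O bandwidth.
CAP_CPU, CAP_IO = 4.0, 100.0

# Task-level view: four tasks, each reserving its peak demand (2.0 cores, 40 MB/s).
task_level = admit([peak_demand(TASK_PHASES)] * 4, CAP_CPU, CAP_IO)

# Phase-level view: the same four tasks, one in reduce and three in shuffle.
current = [REDUCE, SHUFFLE, SHUFFLE, SHUFFLE]
phase_level = admit([(p.cpu, p.io) for p in current], CAP_CPU, CAP_IO)
```

Under these assumed numbers, the task-level view runs out of CPU after two tasks, while the phase-level view admits three before I/O bandwidth becomes the limit. A real scheduler would also have to handle phase transitions, fairness, and deadlines, which this sketch leaves out.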

B. M. Patil | Swati R. Mahendrakar
