In recent years, MapReduce has become a popular model for data-intensive computation. MapReduce can significantly reduce the execution time of data-intensive jobs by breaking each job into small map and reduce tasks and executing them in parallel across a large number of machines. However, existing solutions mainly schedule at the task level, which yields sub-optimal job performance because a task's resource requirements may vary during its lifetime. This variation makes it difficult for the task-level schedulers of existing systems to utilize available resources effectively and thereby reduce job execution time. To overcome this limitation, PRISM, a Phase and Resource Information-aware Scheduler for MapReduce, is introduced. PRISM is a fine-grained, resource-aware MapReduce scheduler that divides tasks into phases and performs resource-aware scheduling for MapReduce clusters at the level of phases. Each phase has a constant resource usage profile, so that no phase suffers from starvation. PRISM also offers high resource utilization and provides a 1.3x improvement in job running time compared to the current Hadoop schedulers.
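To illustrate why scheduling at the level of phases can use resources more effectively than scheduling at the level of tasks, the following minimal Python sketch compares the two on a single node. It is not PRISM's algorithm: the phase names, the CPU and memory figures, the node capacity, and the helper names `Phase`, `task_level_demand`, `fits`, and `admit` are all hypothetical and chosen only for illustration. A task-level scheduler is assumed to reserve the peak demand of a task over its whole lifetime, whereas a phase-level scheduler is assumed to reserve only what the current phase needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One phase of a task with a constant resource profile (hypothetical values)."""
    name: str
    cpu: float  # cores
    mem: float  # GB


def task_level_demand(phases):
    """Peak per-resource demand over all phases: what a task-level scheduler reserves."""
    return (max(p.cpu for p in phases), max(p.mem for p in phases))


def fits(free, demand):
    """True if every resource in `demand` is available in `free`."""
    return all(f >= d for f, d in zip(free, demand))


def admit(capacity, demands):
    """Greedily admit demands in order; return how many fit on the node."""
    free = list(capacity)
    admitted = 0
    for d in demands:
        if fits(free, d):
            free = [f - x for f, x in zip(free, d)]
            admitted += 1
    return admitted


# Hypothetical reduce task with three phases of differing resource usage.
reduce_task = [
    Phase("shuffle", cpu=0.5, mem=1.0),
    Phase("merge", cpu=1.0, mem=2.0),
    Phase("reduce", cpu=2.0, mem=1.0),
]

node = (4.0, 4.0)  # assumed node capacity: 4 cores, 4 GB

# Task-level: each of four tasks reserves its peak demand (2 cores, 2 GB).
peak = task_level_demand(reduce_task)
task_level_admitted = admit(node, [peak] * 4)

# Phase-level: four tasks that are all in the shuffle phase reserve only
# what that phase needs (0.5 cores, 1 GB).
shuffle = (reduce_task[0].cpu, reduce_task[0].mem)
phase_level_admitted = admit(node, [shuffle] * 4)

print(task_level_admitted, phase_level_admitted)
```

Under these assumed numbers the node admits two tasks when each reserves its peak demand, but four when only the demand of the current shuffle phase is reserved. A real phase-level scheduler must also decide what happens when a running task moves to a more demanding phase, which this sketch does not model.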