Resource-Aware Scaling of Multi-threaded Java Applications in Multi-tenancy Scenarios

Cloud platforms are becoming more prevalent in every computational domain, particularly in e-Science. A typical scientific workload has a long execution time or is data intensive. An execution environment for such applications, which belong to different tenants, must handle the horizontal scaling of execution flows (i.e. threads) and allocate resources in a way that takes into account the actual progress made by each tenant. While this is trivial for Bag-of-Tasks and embarrassingly parallel jobs, it is hard for HPC single-process multi-threaded applications, because they cannot be scaled automatically just by adding more virtual machines to execute the workload. In this paper we present MengTian, a distributed execution platform that addresses these issues. It encompasses several extensions to the Java execution environment, ranging from middleware to the virtual machine code and libraries. Our Java-based platform provides a Single System Image abstraction, supported by a Partially Global Address Space, to transparently spawn threads across a cluster of machines. It monitors progress at different levels of detail, and it both accounts for and restricts resource consumption. The overall goal is to redistribute resources among different JVM instances so that the progress achieved per unit of resource consumed increases over time.
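The redistribution goal stated above can be illustrated with a minimal sketch. This is not MengTian's actual policy, which the abstract does not specify. It assumes, purely for illustration, that each JVM instance reports a progress figure and a resource-consumption figure per monitoring interval, and that the next interval's budget is split in proportion to each instance's progress-per-resource ratio. The class `ProgressAwareAllocator`, the type `Sample`, the method `redistribute`, and the units are all hypothetical names introduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names, units, and the proportional policy are
// assumptions for illustration, not taken from the MengTian platform.
public class ProgressAwareAllocator {

    /** Measurements for one JVM instance over the last monitoring interval. */
    public static final class Sample {
        final double progress;   // assumed unit: work items completed
        final double resources;  // assumed unit: CPU-seconds consumed

        public Sample(double progress, double resources) {
            this.progress = progress;
            this.resources = resources;
        }

        /** Progress achieved per unit of resource; 0 if nothing was consumed. */
        double yield() {
            return resources > 0 ? progress / resources : 0.0;
        }
    }

    /**
     * Splits the budget among instances in proportion to their yield.
     * Falls back to an even split when no instance reported any yield.
     */
    public static Map<String, Double> redistribute(Map<String, Sample> samples,
                                                   double budget) {
        double totalYield = 0.0;
        for (Sample s : samples.values()) {
            totalYield += s.yield();
        }

        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Sample> e : samples.entrySet()) {
            double share = totalYield > 0
                    ? budget * e.getValue().yield() / totalYield
                    : budget / samples.size();
            shares.put(e.getKey(), share);
        }
        return shares;
    }
}
```

Under these assumptions, an instance that completed 30 work items using 10 CPU-seconds and another that completed 10 work items using the same 10 CPU-seconds would receive 6 and 2 units, respectively, of a budget of 8. A real policy would likely also enforce minimum shares so that slow tenants are not starved.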
