Self-adaptive management of the sleep depths of idle nodes in large scale systems to balance between energy consumption and response times

Because real workloads vary over time, a large-scale computer system has a considerable number of idle nodes during most of its operation. These nodes consume energy but do no useful work. To reduce the energy wasted by such active idle nodes, most modern compute nodes provide multi-level dynamic sleep mechanisms that lower power consumption. However, waking a sleeping node takes time, which affects the response times and performance of the system. The deeper a node sleeps, the less energy it consumes, but the longer its wakeup latency. This paper proposes a sleep state management model to balance the system's energy consumption and response times. In this model, idle nodes are classified into groups according to their sleep states. Each group contains nodes of the same sleep depth and forms a reserve pool of a certain readiness level. During resource allocation, nodes in the pool with the highest readiness level are provided to the application first. When that pool does not hold enough nodes, nodes in the pool(s) of the next readiness level(s) are allocated. After each allocation or reclamation of nodes, the number of nodes in each pool is adjusted by raising or lowering the sleep depth of nodes, so that the reserve pools are maintained at all times. A key factor in the effectiveness of this idle node management is the size of the reserve pools. This paper proposes and investigates a self-adaptive approach to this problem, in which the sizes of the reserve pools are dynamically adjusted according to the applications. Our experiments demonstrate that, with our self-adaptive management, the power consumption of idle nodes can be reduced by 84.12% at the cost of a slowdown rate of only 8.85%.
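The pool mechanics described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name `ReservePools`, the fields `pools` and `targets`, and the refill rule in `rebalance` are all hypothetical choices made for illustration. The sketch assumes fixed target sizes for every pool except the deepest one; the self-adaptive policy that would tune those targets is not specified in this abstract and is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReservePools:
    """Idle nodes grouped by sleep depth; index 0 is the most ready pool."""
    pools: List[int]    # idle node count per level, shallowest sleep first
    targets: List[int]  # assumed reserve size for every level except the deepest

    def __post_init__(self) -> None:
        # Assumption of this sketch: the deepest pool has no target of its own.
        if len(self.targets) != len(self.pools) - 1:
            raise ValueError("need one target per pool except the deepest")

    def rebalance(self) -> None:
        # Refill the most ready pools first by waking nodes from deeper levels
        # (or push surplus nodes deeper); the deepest pool absorbs the remainder.
        remaining = sum(self.pools)
        for level, target in enumerate(self.targets):
            self.pools[level] = min(target, remaining)
            remaining -= self.pools[level]
        self.pools[-1] = remaining

    def allocate(self, count: int) -> List[int]:
        """Take `count` nodes, most ready pools first; return nodes taken per level."""
        if count > sum(self.pools):
            raise ValueError("not enough idle nodes")
        taken = []
        for level, available in enumerate(self.pools):
            used = min(count, available)
            self.pools[level] -= used
            taken.append(used)
            count -= used
        self.rebalance()
        return taken

    def reclaim(self, count: int) -> None:
        """Released nodes return fully awake; surplus is then moved deeper."""
        self.pools[0] += count
        self.rebalance()


reserve = ReservePools(pools=[2, 3, 10], targets=[2, 3])
taken = reserve.allocate(4)  # taken == [2, 2, 0]; reserve.pools == [2, 3, 6]
reserve.reclaim(4)           # reserve.pools == [2, 3, 10]
```

In this sketch a request for four nodes drains the most ready pool, takes the rest from the next level, and the pools are then topped up from deeper sleep states. Larger targets would shorten wakeup delays at the price of more energy spent on lightly sleeping nodes, which is the trade-off the self-adaptive sizing is meant to manage.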