Model-Based Thermal Anomaly Detection in Cloud Datacenters

The growing importance, large scale, and high server density of high-performance computing datacenters make them prone to strategic attacks, misconfigurations, and failures of both the cooling and the computing infrastructure. Such unexpected events lead to thermal anomalies (hotspots, fugues, and coldspots), which significantly impact the total cost of operation of datacenters. A model-based thermal anomaly detection mechanism is proposed, which compares the expected thermal maps of a datacenter (obtained using heat generation and extraction models) with the observed thermal maps (obtained using thermal cameras). In addition, a Thermal Anomaly-aware Resource Allocation (TARA) scheme is designed to create time-varying thermal fingerprints of the datacenter so as to maximize the accuracy and minimize the latency of the model-based detection. TARA significantly improves the performance of model-based anomaly detection compared to state-of-the-art resource allocation schemes.
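The core comparison of expected and observed thermal maps can be illustrated with a minimal sketch. This is not the authors' method: the grid representation, the function name `detect_anomalies`, the fixed 3 °C threshold, and the labels are all assumptions made for illustration, and the abstract does not specify how the deviation is actually scored or how hotspots, fugues, and coldspots are told apart.

```python
from typing import List, Tuple

# A thermal map is modeled here as a 2D grid of temperatures in deg C (an assumption).
Grid = List[List[float]]


def detect_anomalies(
    expected: Grid, observed: Grid, threshold: float = 3.0
) -> List[Tuple[int, int, str]]:
    """Flag cells whose observed temperature deviates from the expected
    temperature by more than `threshold` degrees (hypothetical rule)."""
    anomalies = []
    for r, (exp_row, obs_row) in enumerate(zip(expected, observed)):
        for c, (exp_t, obs_t) in enumerate(zip(exp_row, obs_row)):
            diff = obs_t - exp_t
            if diff > threshold:
                anomalies.append((r, c, "hotter_than_expected"))
            elif diff < -threshold:
                anomalies.append((r, c, "colder_than_expected"))
    return anomalies


# Expected map (as if produced by a heat model) and observed map
# (as if produced by a thermal camera); the values are made up.
expected = [[25.0, 26.0], [27.0, 30.0]]
observed = [[25.5, 31.0], [22.0, 30.2]]
result = detect_anomalies(expected, observed)
print(result)
```

A per-cell threshold is the simplest possible comparison; a practical detector would likely also account for sensor noise and for the spatial extent of each deviating region.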
