Thermal anomaly detection in datacenters

The high density of servers in datacenters generates a large amount of heat, which makes thermally anomalous events likely, e.g., computer room air conditioner (CRAC) fan failure, server fan failure, and workload misconfiguration. Because such events increase the cost of maintaining computing and cooling components, they need to be detected, localized, and classified so that appropriate remedial action can be taken. In this article, a hierarchical neural network framework is proposed to detect small-scale (server-level) and large-scale (datacenter-level) thermal anomalies. This novel framework, which is organized into two tiers, analyzes data from heterogeneous sensors, such as sensors built into the servers and external sensors (TelosB). The proposed solution employs a neural network to learn (a) the relationship among sensing values (i.e., internal, external, and fan speed) and (b) the relationship between the sensing values and workload information. The bottom tier of the framework then detects thermal anomalies, whereas the top tier localizes and classifies them. Experimental results show that the solution outperforms other anomaly-detection methods based on regression models, support vector machines, and self-organizing maps.
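The two-tier idea can be illustrated with a minimal sketch. It is not the authors' implementation: a least-squares line stands in for the learned neural network, and the names `ServerDetector`, `classify_scale`, the threshold multiplier `k`, and the `large_scale_fraction` cutoff are hypothetical choices made only for this example. The bottom tier flags a server whose temperature deviates from what its workload predicts; the top tier lists the flagged servers and labels the event as server-level or datacenter-level. It does not attempt to identify the cause of the anomaly.

```python
from statistics import mean, stdev


def fit_linear(xs, ys):
    """Least-squares fit y ~ a*x + b (a stand-in for the learned model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


class ServerDetector:
    """Bottom tier: flags readings that deviate from the fitted model."""

    def __init__(self, loads, temps, k=3.0):
        # k is an assumed multiplier on the training residual spread.
        self.a, self.b = fit_linear(loads, temps)
        residuals = [t - self.predict(l) for l, t in zip(loads, temps)]
        self.threshold = k * stdev(residuals)

    def predict(self, load):
        return self.a * load + self.b

    def is_anomalous(self, load, temp):
        return abs(temp - self.predict(load)) > self.threshold


def classify_scale(flags, large_scale_fraction=0.5):
    """Top tier: localize flagged servers and label the scale of the event.

    The fraction cutoff is an illustrative rule, not taken from the article.
    """
    flagged = sorted(name for name, hit in flags.items() if hit)
    if not flagged:
        return "normal", flagged
    if len(flagged) / len(flags) >= large_scale_fraction:
        return "datacenter-level", flagged
    return "server-level", flagged


# Synthetic training data: CPU load (0..1) versus inlet temperature (deg C).
TRAIN_LOADS = [0.1, 0.3, 0.5, 0.7, 0.9]
TRAIN_TEMPS = [42.2, 45.9, 50.0, 54.1, 57.8]

if __name__ == "__main__":
    detectors = {n: ServerDetector(TRAIN_LOADS, TRAIN_TEMPS) for n in ("s1", "s2", "s3")}
    readings = {"s1": (0.5, 58.0), "s2": (0.5, 50.1), "s3": (0.7, 54.0)}
    flags = {n: detectors[n].is_anomalous(*readings[n]) for n in detectors}
    print(classify_scale(flags))
```

In a real deployment each server would have its own training history and a richer model over internal, external, and fan-speed readings; the single-feature fit here only keeps the example short.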
