An online fault detection model and strategies based on SVM-grid in clouds

Online fault detection is one of the key technologies for improving the performance of cloud systems. The current data of a cloud system is monitored, collected, and used to reflect its state, which can help cloud managers take timely measures before faults occur. Because of the complex structure and dynamic behavior of clouds, existing fault detection methods suffer from low efficiency and low accuracy. To address these problems, this work proposes an online detection model based on a systematic parameter-search method called SVM-Grid, which is built on a support vector machine (SVM). SVM-Grid is used to optimize the parameters of the SVM. Proper attributes of a cloud system's running data are selected for the model by using Pearson correlation and principal component analysis. Strategies for predicting cloud faults and updating fault sample databases are proposed to optimize the model and improve its performance. In comparison with some representative existing methods, the proposed model can achieve more efficient and accurate fault detection for cloud systems.
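The two mechanisms named in the abstract, Pearson-based attribute selection and a grid search over SVM parameters, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the correlation threshold, the power-of-two grid ranges (a common convention for RBF-kernel SVMs), and the `score` callback are all assumptions made for illustration. In practice `score` would return something like the cross-validated accuracy of an SVM trained with the given `C` and `gamma`. The principal component analysis step is omitted.

```python
import math
from itertools import product

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)

def select_attributes(columns, labels, threshold=0.3):
    """Keep attributes whose |r| with the fault label meets the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= threshold]

def grid_search(score, c_exps=range(-5, 16, 2), gamma_exps=range(-15, 4, 2)):
    """Score every (C, gamma) pair on a power-of-two grid; return the best.

    `score(c, gamma)` is a caller-supplied function (assumed here), e.g. the
    cross-validated accuracy of an SVM trained with those parameters.
    Returns a tuple (best_score, best_c, best_gamma).
    """
    best = None
    for c_exp, gamma_exp in product(c_exps, gamma_exps):
        c, gamma = 2.0 ** c_exp, 2.0 ** gamma_exp
        s = score(c, gamma)
        if best is None or s > best[0]:
            best = (s, c, gamma)
    return best

# Toy monitoring data (made up): one label per sample, 1 = fault.
columns = {
    "cpu": [0.1, 0.2, 0.9, 0.8],
    "mem": [0.5, 0.5, 0.5, 0.5],
    "io":  [0.3, 0.1, 0.1, 0.3],
}
labels = [0, 0, 1, 1]
selected = select_attributes(columns, labels)
```

An exhaustive grid is simple and easy to parallelize, at the cost of one model evaluation per grid point. The default ranges above give 11 × 10 = 110 evaluations.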
