Anomaly Detection for Aviation Safety Based on an Improved KPCA Algorithm

A moderate-sized fleet generates thousands of flight datasets per day, so the volume of flight data to be analyzed is very large. In this paper, an improved kernel principal component analysis (KPCA) method is proposed to search for signatures of anomalies in flight datasets through the squared prediction error (SPE) statistic. The number of principal components and the confidence level for the confidence limit are determined automatically by an OpenMP-based K-fold cross-validation algorithm, and the parameter of the radial basis function (RBF) kernel is optimized by a GPU-based kernel learning method. Run on an Nvidia GeForce GTX 660, the GPU-based computation of the RBF parameter is up to 112.9 times (82.6 times on average) faster than sequential CPU execution. The OpenMP-based K-fold cross-validation process for training the KPCA anomaly detection model is up to 2.4 times (1.5 times on average) faster than sequential CPU execution. Experiments show that the proposed approach detects anomalies effectively, with an accuracy of 93.57% and a false positive rate of 1.11%.
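
As a rough illustration of the detection step described above, the following minimal sketch computes a squared prediction error (SPE) score from a KPCA model with an RBF kernel and flags samples whose score exceeds a limit. It is not the authors' implementation. The function names (rbf_kernel, fit_kpca, spe), the fixed values of gamma and n_components, and the synthetic data are all assumptions made for illustration. The limit here is simply the empirical 99% quantile of the training SPE, used as a stand-in for the paper's confidence limit, and no GPU kernel learning or OpenMP cross-validation is attempted.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix with entries exp(-gamma * ||a - b||^2)."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def fit_kpca(X, gamma, n_components):
    """Fit KPCA on training data X (hypothetical helper, not from the paper)."""
    K = rbf_kernel(X, X, gamma)
    col_mean = K.mean(axis=0)   # K is symmetric, so this is also the row mean
    grand_mean = K.mean()
    # Center the kernel matrix in feature space.
    Kc = K - col_mean[None, :] - col_mean[:, None] + grand_mean
    eigvals, eigvecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest ones
    # Scale coefficients so each principal direction has unit norm in feature space.
    alphas = eigvecs[:, top] / np.sqrt(eigvals[top])
    return {"X": X, "gamma": gamma, "col_mean": col_mean,
            "grand_mean": grand_mean, "alphas": alphas}


def spe(model, X_new):
    """SPE: squared norm of the centered feature vector minus its retained scores."""
    Kt = rbf_kernel(X_new, model["X"], model["gamma"])
    row_mean = Kt.mean(axis=1)
    Kt_c = (Kt - model["col_mean"][None, :] - row_mean[:, None]
            + model["grand_mean"])
    scores = Kt_c @ model["alphas"]
    # For an RBF kernel k(x, x) = 1, so the centered self-similarity is:
    k_self = 1.0 - 2.0 * row_mean + model["grand_mean"]
    return k_self - (scores ** 2).sum(axis=1)


# Synthetic stand-in for flight data (assumption: 2 features, Gaussian "normal" class).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
X_normal = rng.normal(size=(200, 2))
X_anomalies = rng.normal(loc=10.0, scale=0.5, size=(20, 2))

model = fit_kpca(X_train, gamma=0.1, n_components=10)
# Empirical 99% quantile of training SPE, standing in for a confidence limit.
threshold = np.quantile(spe(model, X_train), 0.99)

detection_rate = (spe(model, X_anomalies) > threshold).mean()
false_alarm_rate = (spe(model, X_normal) > threshold).mean()
print(f"detection rate: {detection_rate:.2f}, false alarm rate: {false_alarm_rate:.2f}")
```

In this sketch gamma and n_components are fixed by hand; in the approach described above they would instead be chosen by the kernel learning and K-fold cross-validation steps.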
