Toward intelligent cyber-physical systems: Algorithms, architectures, and applications

Cyber-physical systems (CPS) are a new generation of engineered systems that integrate computation with physical processes. The integration of computation, communication, and control gives these systems new capabilities for interacting with the physical world. Uncertainty in the physical environment makes future CPS more reliant on machine learning algorithms, which can learn and accumulate knowledge from historical data to support intelligent decision making. CPS that incorporate such intelligence are termed intelligent CPS, which are safer, more reliable, and more efficient. This thesis studies fundamental supervised and unsupervised machine learning algorithms and examines a new computing architecture for the development of next-generation CPS. Two important applications of CPS, smart pipelines and the smart grid, are also studied. For supervised learning, several generative and discriminative learning methods are proposed to improve learning performance. For generative learning, we build novel classification methods based on exponentially embedded families (EEF), a new probability density function (PDF) estimation method, when some of the sufficient statistics are known. For discriminative learning, we develop an extended nearest neighbor (ENN) method that predicts patterns according to the maximum gain of intra-class coherence. The new method makes a prediction in a “two-way communication” style: it considers not only which samples are the nearest neighbors of the test sample, but also which samples consider the test sample to be among their nearest neighbors. By exploiting the generalized class-wise statistics from all training data, the proposed ENN is able to learn from the global distribution, thereby improving pattern recognition performance and providing a powerful technique for a wide range of data analysis applications.
Based on the concept of ENN, an unsupervised anomaly detection method is also developed. CPS usually produce high-dimensional data, such as text, video, and other multimodal sensor data, so it is necessary to reduce the feature dimension to facilitate learning. We propose an optimal feature selection framework that aims to select feature subsets with maximum discrimination capacity. To further address the information loss caused by feature reduction, we develop a novel learning method, termed the generalized PDF projection theorem (GPPT), to reconstruct the distribution in the high-dimensional raw data space from the low-dimensional feature subspace. Supporting distributed computation throughout a CPS requires a novel computing architecture that offers high-performance computing over multiple spatial and temporal scales and supports the Internet of Things for machine-to-machine communication. We develop a hierarchical distributed Fog computing architecture for next-generation CPS. A prototype of this architecture for smart pipeline monitoring is implemented to verify its feasibility in real-world applications. Regarding applications, we examine false data injection detection in the smart grid. False data injection is a type of malicious attack that can threaten the security of energy systems. We examine the observability of false data injection and develop statistical models to estimate the underlying system states and detect false data injection attacks under different scenarios, enhancing the security of power systems.
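
The “two-way communication” idea behind ENN can be illustrated with a small sketch. This is a simplified illustration only, not the ENN decision rule developed in the thesis (which is based on the gain of intra-class coherence): it merely counts, per class, the training samples that are nearest neighbors of the test sample plus the training samples that would count the test sample among their own nearest neighbors. The function names, the toy data, and the additive scoring are all assumptions made for the example.

```python
import math

def kth_neighbor_distance(i, points, k):
    """Distance from points[i] to its k-th nearest *other* training point."""
    d = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return d[k - 1]

def two_way_predict(x, points, labels, k=3):
    """Hypothetical two-way neighbor vote (illustrative, not the thesis's ENN rule)."""
    # Forward direction: the k training samples closest to the test sample x.
    order = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))
    forward = order[:k]
    # Reverse direction: training samples for which x would fall inside
    # their own k-nearest-neighbor radius.
    reverse = [i for i in range(len(points))
               if math.dist(x, points[i]) < kth_neighbor_distance(i, points, k)]
    # Assumed scoring: one vote per appearance in either direction.
    scores = {}
    for i in forward + reverse:
        scores[labels[i]] = scores.get(labels[i], 0) + 1
    return max(scores, key=scores.get), scores

# Toy data (made up for this example): two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

predicted, scores = two_way_predict((0.5, 0.5), points, labels, k=2)
print(predicted, scores)
```

The reverse direction is what distinguishes this from a plain k-nearest-neighbor vote: a test sample near a tight cluster collects votes from every cluster member that would adopt it as a neighbor, so the score reflects the class as a whole rather than only the k closest points.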
