Hidden Markov models for fault detection in dynamic systems

The invention is a system failure monitoring method and apparatus that learns the symptom-fault mapping directly from training data. The invention first estimates the state of the system at discrete intervals in time. A feature vector $\mathbf{x}$ of dimension $k$ is estimated from sets of successive windows of sensor data. A pattern recognition component then models the instantaneous estimate of the posterior class probability given the features, $p(\omega_i \mid \mathbf{x})$, $1 \le i \le m$, where $m$ is the number of classes. Finally, a hidden Markov model is used to take advantage of temporal context and estimate class probabilities conditioned on recent past history. In this hierarchical pattern of information flow, the time series data are transformed and mapped into a categorical representation (the fault classes) and integrated over time to enable robust decision-making.
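
The final stage described above, combining instantaneous posterior estimates with temporal context, can be illustrated with a standard HMM forward (filtering) recursion. The sketch below is an illustration under stated assumptions, not the method claimed in the patent: the function name `hmm_filter`, the transition matrix, the class priors, and the example posteriors are all hypothetical. It also assumes that dividing each posterior by its (strictly positive) class prior gives a quantity proportional to the likelihood, a common device in hybrid classifier/HMM systems, and that the belief before the first observation equals the priors.

```python
def hmm_filter(posteriors, transition, priors):
    """Forward recursion estimating p(class at t | features up to t).

    posteriors[t][i] : instantaneous classifier estimate p(w_i | x_t)
    transition[i][j] : assumed probability of moving from class i to class j
    priors[i]        : assumed class priors (must be strictly positive)
    """
    m = len(priors)
    belief = list(priors)  # assumed belief before any data is seen
    history = []
    for post in posteriors:
        # Predict: propagate the previous belief through the Markov chain.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(m))
            for j in range(m)
        ]
        # Update: posterior / prior is proportional to p(x_t | w_j).
        unnorm = [predicted[j] * post[j] / priors[j] for j in range(m)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Hypothetical two-class example: index 0 = normal, index 1 = fault.
# The third window produces a spurious high fault estimate.
posteriors = [[0.9, 0.1], [0.9, 0.1], [0.3, 0.7], [0.9, 0.1]]
transition = [[0.99, 0.01], [0.01, 0.99]]  # classes assumed to change rarely
priors = [0.5, 0.5]

history = hmm_filter(posteriors, transition, priors)
for t, belief in enumerate(history):
    print(t, [round(b, 4) for b in belief])
```

With these made-up numbers, the isolated fault estimate in the third window is heavily discounted because the assumed transition matrix makes a sudden change of class unlikely; a sustained run of high fault posteriors would instead move the filtered probability toward the fault class.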
