An investigation of feedforward neural networks with respect to the detection of spurious patterns

This thesis investigates feedforward neural networks in the context of classification tasks with respect to the detection of patterns that do not belong to the same categories of patterns used to train the network. This refers to the problem of the detection and/or rejection of spurious or novel patterns. In particular, the multilayer perceptron network (MLP) trained with the backpropagation algorithm is examined in this respect and different strategies for improving its performance in the detection of spurious patterns are considered. The problem is investigated from different points of view that vary from the modification of the multilayer perceptron network with different configurations that make it more intrinsically able to detect spurious information, to the introduction of novel auxiliary mechanisms which, when integrated with the MLP network, can provide an overall enhancement in the system's rejection capabilities. These different network configurations are examined with respect to the characteristics of the decision regions constructed by the networks in 2-D classification problems, and the implications of these constructions for general pattern rejection are discussed. The technique of inversion in multilayer networks through gradient descent is used to observe the degree of visual correlation between the input patterns recognised as valid by the networks and training class prototypes. Practical experiments on the classification of handwritten characters are employed as a test environment for the different approaches described. Radial basis function networks (RBFs) are also examined in the same context and an experimental comparison is made between RBFs and the different MLP configurations studied.

Publications Arising From This Work

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1995). Efficient detection of spurious inputs for improving the robustness of MLP networks in practical applications. Neural Computing & Applications 3 (4), 202-212, Springer-Verlag.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1995). Investigating feedforward neural networks with respect to the rejection of spurious patterns. Pattern Recognition Letters 16 (2), 207-212.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1994). Recognizing novelty in classification tasks. NIPS'94 Workshop on Novelty Detection and Adaptive Systems Monitoring, Vail CO, USA.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1994). Reliability of multilayer perceptron networks for spurious pattern rejection. Proc. 1st Brazilian Symposium on Neural Networks, 29-34, Caxambu MG, Brazil.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1993). Enhanced reliability of multilayer perceptron networks through controlled pattern rejection. Electronics Letters 29 (3), 261-263.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1993). The guard unit approach for rejecting patterns from untrained classes. Proc. 1993 World Congress on Neural Networks (WCNN'93), IV 256-259, Portland OR, USA.

Vasconcelos, G.C., Fairhurst, M.C., and Bisset, D.L. (1993). Investigating the recognition of false patterns in backpropagation networks. Proc. 3rd IEE International Conference on Artificial Neural Networks, Brighton, U.K.
