Development of ANN with Adaptive Connections by CE

This chapter presents the use of Artificial Neural Networks (ANNs) and Evolutionary Computation (EC) techniques to solve real-world problems, including those with a temporal component. The development of ANNs still suffers from some problems that date from the beginning of the field, and these can be mitigated by applying EC to the development process. In this chapter, we propose a multilevel system, in which each level is based on EC, to adjust the architecture of ANNs and to train them. Finally, the proposed system makes it possible to add new characteristics to the processing elements (PEs) of the ANN without modifying the development process. This characteristic allows a faster convergence between natural and artificial neural networks.

This chapter, by Dorado, Pedreira & Miguélez, appears in the book Artificial Neural Networks in Real-Life Applications, edited by Juan R. Rabunal and Julian Dorado, © 2006, Idea Group Inc.

Introduction

Nature has proven to be a highly efficient problem solver, so much so that countless forms of life (in the end, countless solutions to the problem of survival) make their way in the most complex and diverse conditions. Great advances in several fields of science have been achieved by imitating certain mechanisms of nature. One example is Artificial Neural Networks (ANNs) (Freeman & Skapura, 1993), which are based on brain activity and are used for pattern recognition and classification tasks.
Another example of this symbiosis between science and nature is Evolutionary Computation (EC) (Holland, 1975; Bäck, 1996), whose techniques are based on evolution by natural selection and work on a population of potential solutions to a given problem, to which both mutation and crossover operators are applied. EC techniques are mainly used for fitting and optimisation tasks.

ANNs are currently the most suitable of the artificial intelligence techniques for pattern recognition. Although the technique is not entirely free of problems, for several decades it has produced reliable systems that have been successfully transferred to industry. The internal structure of an ANN consists of a series of processing elements (PEs) that are interconnected, as biological neurons are. The problem-solving ability of an ANN depends not only on the type and number of PEs but also on the pattern of interconnections. Several studies address the development of PE architectures, and others the optimisation of ANN learning algorithms, which are the ones that adjust the values of the connections. These works tackle the two main current limitations of ANNs: there is no mathematical basis for calculating the optimal architecture of an ANN, and the existing learning algorithms have, in some cases, convergence problems that make training times quite long.

Some of the most important ANNs are recurrent ANNs (RANNs) (Haykin, 1999), which tackle temporal problems. Such problems are quite common in the real world and differ from the classical non-temporal classification problems handled by static ANNs. However, the difficulty of implementing RANNs led to the use of workarounds for solving dynamic problems, such as delays (time-delay neural networks, TDNN) or the unfolding of recurrent networks into feed-forward networks (backpropagation through time, BPTT).
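The delay workaround mentioned above can be illustrated with a small sketch. It is not taken from the chapter: the function name `make_delay_windows` and the parameter `n_delays` are hypothetical, and the sketch only shows the general idea of feeding a static network a window of past values so that it can treat a temporal problem as a non-temporal one.

```python
from typing import List, Sequence, Tuple


def make_delay_windows(
    series: Sequence[float], n_delays: int
) -> List[Tuple[List[float], float]]:
    """Turn a time series into (inputs, target) pairs for a static network.

    Each input holds the n_delays values that precede the target value.
    Both the name and the parameter are illustrative assumptions.
    """
    if n_delays < 1:
        raise ValueError("n_delays must be at least 1")
    pairs = []
    # Start at the first index that has n_delays predecessors.
    for t in range(n_delays, len(series)):
        window = list(series[t - n_delays:t])
        pairs.append((window, series[t]))
    return pairs


if __name__ == "__main__":
    signal = [0.0, 0.1, 0.4, 0.9, 1.6, 2.5]
    for inputs, target in make_delay_windows(signal, n_delays=3):
        print(inputs, "->", target)
```

Each pair can then be presented to a feed-forward network as an ordinary static pattern. The cost of this approach is that the window length must be fixed in advance, which is one more design parameter to choose.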
On reaching maturity, recurrent ANNs that use real-time recurrent learning (RTRL) work better with dynamic phenomena than the classical ones, although they still have some problems of design and training convergence. Regarding the first of these problems, the architectural design of the network, both ANN models (feed-forward and recurrent) offer a vast number of design possibilities. This allows experimentation but also raises the question of which combination of design and training parameters is best. Unfortunately, there is no mathematical basis to support the selection of a specific architecture, and only a few works (Lapedes & Farber, 1988; Cybenko, 1989) have established lower and upper bounds on the number of PEs for some models and restricted types of problems. Apart from these works, there are only empirical studies (Yee, 1992) on the subject. As a result, it cannot be said with certainty that the selected architecture is the most suitable one without performing exhaustive architectural tests. Besides, nowadays there is a clear