Implementation of a high-performance hardware-based toroidal neural network with learning capability

Neural networks play an important role in artificial intelligence applications. In most applications, neural networks are implemented in software. Although a software implementation provides flexibility, its computation speed is limited by the sequential architecture of the underlying machine. In most applications using artificial neural networks, the learning procedure is carried out off-line. Training a neural network requires a large number of mathematical operations, so software implementations of neural network systems work well only on high-performance computers, and their learning performance is inadequate on embedded systems. With the development of modern semiconductor technologies, researchers have attempted to realize neural networks in hardware in order to improve performance. Designs proposed in the past rely on special architectures and parameters to achieve their performance. This paper proposes a high-efficiency, generic neural network hardware architecture. The architecture uses the toroidal series multiple data stream to process back-propagation neural network operations and provides full recall and learning capabilities. Users can adjust the number of processing elements (PEs) in the system to match the requirements of the application by setting register values. Because the proposed system is implemented in hardware, it can be integrated into embedded systems easily. The experimental results show that the system achieves much higher performance while using fewer logic elements and maintaining flexibility.
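The abstract does not give the details of the toroidal data flow, so the following is only an illustrative software sketch of a generic ring (toroidal) arrangement of PEs, not the architecture proposed in this paper. It assumes that each PE stores one row of a layer's weight matrix, that the input activations circulate around the ring one hop per step, and that each PE accumulates one output of the recall (forward) pass. The function name `ring_matvec` and the data layout are hypothetical.

```python
def ring_matvec(weights, x):
    """Simulate a ring of n PEs computing y = W x (assumed layout).

    PE i stores row i of W. The input values circulate around the
    ring, so after n steps every PE has seen every input exactly once.
    """
    n = len(x)
    if len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("this sketch assumes a square n x n weight matrix")

    ring = list(x)          # ring[i]: input value currently held by PE i
    index = list(range(n))  # index[i]: which input that value corresponds to
    acc = [0.0] * n         # acc[i]: partial sum held by PE i

    for _ in range(n):
        # All PEs perform one multiply-accumulate (parallel in hardware).
        for pe in range(n):
            acc[pe] += weights[pe][index[pe]] * ring[pe]
        # Every PE passes its input value to its neighbour; the ends wrap.
        ring = ring[1:] + ring[:1]
        index = index[1:] + index[:1]

    return acc


if __name__ == "__main__":
    W = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
    print(ring_matvec(W, [1, 0, -1]))
```

In this sketch the number of steps equals the number of PEs, and each PE only communicates with its immediate neighbour, which is the property that usually makes ring-style layouts attractive for hardware. How the paper's design maps layers of different sizes onto a configurable number of PEs, and how it handles the learning pass, is not specified in the abstract and is not modelled here.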