On neural network hardware and programming paradigms

Implementation guidelines and performance benchmarks for dedicated neural network hardware and software are proposed. An example of a dedicated neural network processor is presented, together with the implementation of a popular neural network architecture and learning algorithm on this hardware, with emphasis on the specific architecture and programming paradigm. Finally, the performance of the implementation is benchmarked against a supercomputer in both single-processor and multiple-processor environments. Such implementations are aimed at neural network applications requiring rapid online learning.