Training Backpropagation Neural Network in MapReduce

A BP neural network is typically trained serially on a single machine, but massive training data makes this process slow and resource-intensive. One effective solution to these problems is distributed training with the MapReduce framework. Several such methods have been proposed, but they remain slow for neural networks with complex structures. This paper presents MR-TMNN (MapReduce based Training in Mapper Neural Network), a new MapReduce-based method for BP neural network training. It moves most of the training process into the Mappers, which emit only the variations of weights and thresholds to the Reducer for a batch update. This effectively reduces the volume of intermediate data produced by the Mappers, lowering I/O cost and thereby accelerating training. Experimental results show that, compared with the conventional training method, MR-TMNN achieves better convergence without losing much accuracy, and it continues to perform well as the complexity of the network structure increases.
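The mapper-heavy pattern described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: it uses a single sigmoid unit instead of a full multilayer BP network, simulates MapReduce shards with ordinary lists, and all function names (`mapper`, `reducer`) are hypothetical. The key point it shows is that each mapper emits only accumulated weight/threshold deltas for its data split, so the intermediate data volume is independent of the shard size.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mapper(shard, weights, bias, lr):
    # Each mapper runs local BP passes over its data shard and emits
    # only the accumulated weight/threshold deltas (not per-sample
    # records), which is how MR-TMNN shrinks the intermediate data.
    dw = [0.0] * len(weights)
    db = 0.0
    for x, y in shard:
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        out = sigmoid(z)
        err = (y - out) * out * (1.0 - out)  # BP delta for a sigmoid unit
        for i, xi in enumerate(x):
            dw[i] += lr * err * xi
        db += lr * err
    return dw, db

def reducer(deltas, weights, bias):
    # The reducer batch-updates the global parameters by summing
    # the deltas emitted by all mappers.
    for dw, db in deltas:
        for i in range(len(weights)):
            weights[i] += dw[i]
        bias += db
    return weights, bias

# Toy data: learn logical OR, split into two shards to mimic
# distributed input splits; each epoch corresponds to one MapReduce job.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
shards = [data[:2], data[2:]]
w, b = [0.0, 0.0], 0.0
for epoch in range(2000):
    deltas = [mapper(s, w, b, lr=0.5) for s in shards]
    w, b = reducer(deltas, w, b)

preds = [1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5 else 0
         for x, _ in data]
```

In a real Hadoop job the epoch loop would be a chain of MapReduce jobs, with the updated weights broadcast back to the mappers (e.g. via the distributed cache) before each iteration.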