MapReduce for Scalable Neural Nets Training

A key benefit of cloud computing is that large applications can be scaled with little effort, and many companies have already moved their infrastructure to the cloud. An enterprise IT infrastructure often includes a workflow management system. In a cloud, several workflow engines can coexist, each with its own functional responsibility. A central instance is in charge of distributing process fragments to these engines without incurring high technical or economic costs. Deriving the cost functions, that is, determining which fragments should be executed on which engine at minimal cost, is a complex problem, especially when several processes have to be executed simultaneously. This paper addresses the problem of delegating an entire process to a distributed infrastructure and shows how it can be solved efficiently with neural networks. To maintain computational performance when handling multiple neural networks, we use the MapReduce framework. The distributed computation capability of MapReduce helps process the large volume of training data generated by system monitoring in the networks. As a result, the computational load on the central instance is reduced, and the entire system can scale with the growing infrastructure.
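
As a rough illustration of how MapReduce-style training can take load off a central instance, the sketch below splits training data into shards, computes a partial gradient per shard (map), and sums the partial gradients before a single weight update (reduce). It is not the method of this paper: the single linear unit, the squared-error loss, the toy data, and the names `map_gradient`, `reduce_gradients`, and `train` are all assumptions chosen for brevity, and the map step runs locally here, whereas a real deployment would run it on worker nodes.

```python
from functools import reduce

def map_gradient(weights, shard):
    """Map step: sum of squared-error gradients for one data shard."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad, len(shard)

def reduce_gradients(a, b):
    """Reduce step: combine two partial results (gradient sum, sample count)."""
    grad_a, n_a = a
    grad_b, n_b = b
    return [ga + gb for ga, gb in zip(grad_a, grad_b)], n_a + n_b

def train(shards, n_features, learning_rate=0.1, epochs=1000):
    """Batch gradient descent where each epoch is one map/reduce round."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        # In a cluster, each call to map_gradient would run on a separate worker.
        partials = map(lambda shard: map_gradient(weights, shard), shards)
        total_grad, count = reduce(reduce_gradients, partials)
        weights = [w - learning_rate * g / count
                   for w, g in zip(weights, total_grad)]
    return weights

# Hypothetical toy data: target = 2*x + 1; the constant 1.0 acts as a bias input.
shards = [
    [((0.0, 1.0), 1.0), ((1.0, 1.0), 3.0)],
    [((2.0, 1.0), 5.0), ((3.0, 1.0), 7.0)],
]
weights = train(shards, n_features=2)
print(weights)
```

Because the reduce step only adds gradient sums and sample counts, the coordinating node handles one small vector per shard rather than the raw monitoring data, which is the property that lets this pattern scale as shards are added.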