EdgePS: Selective Parameter Aggregation for Distributed Machine Learning in Edge Computing

In this paper, we propose EdgePS, an advanced parameter server approach for distributed machine learning in edge computing scenarios. Unlike the Conventional Parameter Server (CPS) approach, which performs parameter aggregation after every local training epoch, EdgePS synchronizes the parameters of all workers only when further local training cannot improve the global model's performance. We first analyze how local training impacts the performance of the global model, and then design algorithms to determine the best time to perform parameter aggregation. Both real-testbed experiments and extensive large-scale simulations demonstrate that EdgePS can train a practical machine learning model, e.g., VGG-16, in up to 59.28% less time than the CPS approach. With the same training time, EdgePS improves model accuracy by up to 30.19% compared with the state-of-the-art distributed machine learning algorithm designed for edge computing scenarios.
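To make the contrast with CPS concrete, the following is a minimal sketch of a selective-aggregation training loop. Everything in it is an illustrative assumption rather than the paper's actual algorithm: the quadratic toy objective, the FedAvg-style averaging used as the aggregation step, and in particular the stall-detection heuristic (aggregate when one more local epoch improves a loss proxy by less than `tol`), since the abstract does not specify EdgePS's real trigger criterion. A CPS baseline would simply aggregate unconditionally after every epoch.

```python
import numpy as np

rng = np.random.default_rng(0)


def global_loss(params):
    """Toy stand-in for global model performance (a quadratic bowl)."""
    return float(np.sum(params ** 2))


def selective_aggregation(num_workers=4, dim=10, epochs=50, lr=0.1, tol=1e-4):
    """Hypothetical EdgePS-style loop: aggregate only when further local
    training stops improving the (estimated) global objective.
    The stall test is an assumed heuristic, not the paper's rule."""
    global_params = rng.normal(size=dim)
    workers = [global_params.copy() for _ in range(num_workers)]
    best_loss = global_loss(global_params)

    for _ in range(epochs):
        # Each worker runs one local epoch with its own noisy gradient.
        for i in range(num_workers):
            grad = 2 * workers[i] + 0.1 * rng.normal(size=dim)
            workers[i] = workers[i] - lr * grad

        # Proxy for global performance: loss of the averaged model.
        # (A real server would estimate this cheaply, e.g. from
        # worker-reported local losses, instead of averaging weights.)
        candidate = np.mean(workers, axis=0)
        cand_loss = global_loss(candidate)

        if best_loss - cand_loss < tol:
            # Local training has stalled globally: synchronize now.
            # CPS would perform this aggregation after every epoch.
            global_params = candidate
            workers = [global_params.copy() for _ in range(num_workers)]
        best_loss = cand_loss

    return global_params


if __name__ == "__main__":
    final = selective_aggregation()
    print("final global loss:", global_loss(final))
```

The point of the sketch is the communication pattern, not the optimizer: workers keep training locally for free, and the expensive synchronization step fires only when the improvement signal stalls, which is where the reported training-time savings over per-epoch aggregation would come from.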