Stochastic Approximation versus Sample Average Approximation for population Wasserstein barycenter calculation
In the machine learning and optimization communities, there are two main approaches to convex risk minimization. The first is Stochastic Approximation (SA); the second is Sample Average Approximation (SAA), with proper regularization in the non-strongly convex case. It is currently known that both approaches are on average equivalent (up to a logarithmic factor) in terms of oracle complexity (the required number of stochastic gradient evaluations). What about total complexity? The answer depends on the specific problem. However, starting from the work \cite{nemirovski2009robust}, it has been generally accepted that SA is better than SAA. Nevertheless, for large-scale problems SA may run out of memory, since storing all the data on one machine and organizing online access to it can be impossible without communication with other machines. SAA, in contrast to SA, allows parallel/distributed computation. In this paper, we show that SAA may outperform SA in the problem of estimating the population ($\mu$-entropy regularized) Wasserstein barycenter, even in a non-parallel (non-decentralized) setup.
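For orientation, the two approaches can be written out in generic form. The notation below is illustrative and is not taken from the abstract: $X$ is a convex feasible set, $f(x,\xi)$ is a loss that is convex in $x$, $\xi$ is the random data, $\eta_k$ is a step size, $\Pi_X$ is the Euclidean projection onto $X$, $m$ is the sample size, and $\lambda > 0$ with reference point $x^0$ is one common (assumed) choice of quadratic regularizer for the non-strongly convex case. The risk minimization problem is
\[
\min_{x \in X} F(x) := \mathbb{E}_{\xi}\, f(x,\xi).
\]
SA processes one fresh sample $\xi^k$ per iteration, for example by a projected stochastic gradient step:
\[
x^{k+1} = \Pi_X\!\left(x^k - \eta_k \nabla_x f(x^k,\xi^k)\right).
\]
SAA instead fixes a sample $\xi_1,\dots,\xi_m$ in advance and solves the resulting deterministic problem, regularized here as a sketch:
\[
\hat{x}_m \in \arg\min_{x \in X} \; \frac{1}{m}\sum_{i=1}^{m} f(x,\xi_i) + \frac{\lambda}{2}\,\|x - x^0\|_2^2 .
\]
Oracle complexity counts the evaluations of $\nabla_x f(\cdot,\xi)$ needed to reach a given accuracy in $F$, whereas total complexity also accounts for the arithmetic cost of each evaluation and of solving the SAA problem.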