Exploration and Regularization of the Latent Action Space in Recommendation