Variational Inference for Model-Free and Model-Based Reinforcement Learning