Is High Variance Unavoidable in RL? A Case Study in Continuous Control