Provable Reinforcement Learning with a Short-Term Memory