Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning