Text Generation with Efficient (Soft) Q-Learning