Rejoinder to 'Reinforcement learning behaviors in sponsored search'