Shared Autonomy Based on Human-in-the-loop Reinforcement Learning with Policy Constraints