End-to-end self-driving policy based on the deep deterministic policy gradient algorithm considering the state distribution