Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations