Self-improvement Based on Reinforcement Learning, Planning and Teaching