Meta-level reasoning in reinforcement learning