Policy iterations for reinforcement learning problems in continuous time and space - Fundamental theory and methods