Edinburgh Research Explorer

Learning Constrained Generalizable Policies by Demonstration