Learning powerful policies and better dynamics models by encouraging consistency