Towards integrating model dynamics for sample-efficient reinforcement learning