VIOLA: Object-Centric Imitation Learning for Vision-Based Robot Manipulation