Tracking Objects with 3D Representation from Videos