Detailed 2D-3D Joint Representation for Human-Object Interaction