Towards 3D Scene Understanding by Referring Synthetic Models