High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion