Scale-aware network with modality-awareness for RGB-D indoor semantic segmentation