Vision Transformers: From Semantic Segmentation to Dense Prediction