Extracting Building Footprint From Remote Sensing Images by an Enhanced Vision Transformer Network