A semantic segmentation method for buildings in remote sensing images based on an improved UNet

Deep neural networks tend to become unstable and to overfit as the number of layers grows, and the current mainstream remedy is batch normalization (BN). However, BN is sensitive to batch size: when the batch size is small, model performance degrades. For a relatively large model, limited video memory prevents the use of a large batch size, which in turn limits the model's performance. To address this dependence of BN on batch size, this paper replaces BN with group normalization (GN) in the UNet network, reducing the model's sensitivity to batch size. Experiments are carried out on the WHUBuilding dataset. The results show that the improved model (UNet-GN) improves the mean intersection over union (MIoU) and mean pixel accuracy (MPA) by 10.66% and 1.65%, respectively, compared with the original model (UNet-BN).
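The difference in batch-size dependence between the two normalization schemes can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names `group_norm` and `batch_norm`, the toy data layout (each sample is a list of channels, each channel a flat list of pixel values), and the omission of the learnable scale and shift parameters are all assumptions made for illustration. GN computes its statistics per sample over groups of channels, whereas BN in training mode computes them per channel across the whole batch.

```python
import math

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _normalize(values):
    """Shift a flat list of values to zero mean and scale to unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + EPS) for v in values]


def group_norm(batch, num_groups):
    """Group normalization without affine parameters.

    batch[n][c] is the flat list of pixel values of sample n, channel c.
    Statistics are computed per sample over each group of channels,
    so they never involve other samples in the batch.
    """
    out = []
    for sample in batch:
        assert len(sample) % num_groups == 0, "channels must divide evenly"
        size = len(sample) // num_groups
        normed_sample = []
        for g in range(num_groups):
            group = sample[g * size:(g + 1) * size]
            pixels = len(group[0])
            normed = _normalize([v for channel in group for v in channel])
            # Split the normalized values back into their channels.
            for i in range(size):
                normed_sample.append(normed[i * pixels:(i + 1) * pixels])
        out.append(normed_sample)
    return out


def batch_norm(batch):
    """Training-mode batch normalization without affine parameters.

    Statistics are computed per channel across all samples in the batch.
    """
    num_channels = len(batch[0])
    out = [[None] * num_channels for _ in batch]
    for c in range(num_channels):
        pixels = len(batch[0][c])
        normed = _normalize([v for sample in batch for v in sample[c]])
        for n in range(len(batch)):
            out[n][c] = normed[n * pixels:(n + 1) * pixels]
    return out


# Toy data: 4 channels, 2 pixels per channel (hypothetical values).
sample_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
sample_b = [[10.0, 0.0], [-3.0, 2.0], [8.0, 8.5], [0.5, -1.0]]

# Normalize sample_a alone (batch size 1) and together with sample_b.
gn_alone = group_norm([sample_a], num_groups=2)[0]
gn_batch = group_norm([sample_a, sample_b], num_groups=2)[0]
bn_alone = batch_norm([sample_a])[0]
bn_batch = batch_norm([sample_a, sample_b])[0]

print("GN output unchanged by batch composition:", gn_alone == gn_batch)
print("BN output unchanged by batch composition:", bn_alone == bn_batch)
```

In this sketch the GN result for `sample_a` is the same whether it is normalized alone or alongside `sample_b`, while the BN result changes with the batch contents. This is the property that motivates swapping BN for GN when memory limits force a small batch size.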