Edge gradient feature and long distance dependency for image semantic segmentation

Image semantic segmentation is a challenging problem in computer vision. Recently, deep convolutional neural networks (DCNNs) have achieved outstanding performance on image semantic segmentation. However, most current methods still struggle to segment object edges accurately and to preserve the integrity of objects. In this study, the authors first construct a difference-pooling module within the DCNN to extract object edge gradients and obtain finer boundaries in the segmentation results. Then, a combination of the pyramid pooling module and atrous spatial pyramid pooling extracts global image features and contextual structure information by building long-distance dependencies between pixels, acting much like a simple fully connected conditional random field (CRF). Unlike other methods, the proposed method needs no extra pre-processing or post-processing steps, such as extracting gradient features with a traditional algorithm or building context relationships with a CRF. Finally, experimental results on the PASCAL VOC 2012 benchmark indicate that the proposed model obtains finer boundaries and more complete object parts.
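The abstract does not define the difference-pooling module precisely. As a rough illustration only, one plausible reading is a per-window difference between max pooling and average pooling: in a flat region the two agree and the response is near zero, while a window straddling an object boundary mixes high and low activations and produces a large response, which behaves like an edge-gradient cue. The function name `difference_pool` and this max-minus-mean formulation are assumptions for illustration, not the authors' actual module.

```python
def difference_pool(image, k=2):
    """Hypothetical difference pooling: slide a k x k window with stride k
    over a 2D feature map and return (max - mean) per window. This is an
    assumed formulation, not the paper's exact definition."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - k + 1, k):
        row = []
        for j in range(0, w - k + 1, k):
            window = [image[i + di][j + dj] for di in range(k) for dj in range(k)]
            # Flat windows give max == mean (response 0); windows crossing
            # an edge mix values and give a positive response.
            row.append(max(window) - sum(window) / len(window))
        out.append(row)
    return out

# Toy 4x4 "feature map" with a vertical edge between columns 2 and 3.
feat = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(difference_pool(feat, k=2))  # → [[0.0, 0.5], [0.0, 0.5]]
```

Only the windows overlapping the edge respond, so the pooled map acts as a coarse boundary indicator that a segmentation head could exploit.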