Small Components Parsing via Multi-Feature Fusion Network

Part parsing is a fundamental task for fine-grained image understanding in multimedia and computer vision. Current research on part parsing focuses on objects with large components, such as humans and cars. This paper instead addresses the segmentation of objects with small components, a task we call small components parsing. We propose a novel strategy for small components parsing that fuses multiple features to exploit context information. We introduce a Separable Spatial Pyramid module that embeds spatial context information by fusing spatial features at different scales. In the decoding stage, an attention-based feature fusion unit exploits semantic context information to highlight details. Specifically, we design a Residual Upsampling scheme that recovers more details by taking both spatial and channel characteristics into account. Experiments on the RHD-PARSING and CamVid datasets show that our method achieves solid performance on small components parsing, with hand parsing as the example task, and competitive results on scene parsing.