Feature Learning Improved by Location Guidance and Supervision for Object Detection
In recent years, single-stage detectors have developed rapidly; however, their detection precision remains lower than that of multi-stage detectors. This paper analyzes and compares single-stage and multi-stage detectors in detail, and the comparison reveals that single-stage detectors suffer from several problems, including feature loss and inaccurate feature extraction. To alleviate these deficiencies, this paper proposes a novel detection model, dubbed Optimized Network (OptNet). OptNet consists of three modules: a pyramid of attention features, feature alignment, and consistency supervision (CS). The pyramid of attention features, built on feature pyramid networks (FPNs), introduces a novel branch named attention FPN (AtFPN), which aggregates the multi-layer features of the backbone network and refines the object features with lightweight attention modules. AtFPN alleviates both the loss of feature pyramid information and the blocking of feature transmission between adjacent layers, and it also provides global information to the model. The feature alignment module aligns each anchor box with its features, using object location information to guide the network toward extracting precise object features. Finally, CS accelerates network optimization and reduces the semantic differences between features on different layers. In the detection stage, OptNet uses its first detection result to refine the model's prediction and improve accuracy. Experiments on the MS COCO 2017 dataset demonstrate that OptNet yields a significant improvement in detection precision.
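The idea of aggregating multi-layer backbone features and then refining them with a lightweight attention module can be illustrated with a minimal NumPy sketch. This is not the AtFPN design itself, which the abstract does not specify: the nearest-neighbour resizing, the averaging fusion, the squeeze-and-excitation style channel gate, and all names (resize_nearest, channel_attention, aggregate_with_attention, w1, w2) are assumptions chosen only for illustration.

```python
import numpy as np

def resize_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map (assumed resizing rule)."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]

def channel_attention(feat, w1, w2):
    """Hypothetical lightweight gate: global average pool, 2-layer MLP, sigmoid."""
    pooled = feat.mean(axis=(1, 2))                 # (C,) global context per channel
    hidden = np.maximum(w1 @ pooled, 0.0)           # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # per-channel weight in (0, 1)
    return feat * gate[:, None, None]

def aggregate_with_attention(features, w1, w2):
    """Resize every level to the middle level's size, average, then apply the gate."""
    _, h, w = features[len(features) // 2].shape
    fused = np.mean([resize_nearest(f, h, w) for f in features], axis=0)
    return channel_attention(fused, w1, w2)

# Three toy pyramid levels with 8 channels each; random weights stand in for learned ones.
rng = np.random.default_rng(0)
C = 8
features = [rng.standard_normal((C, s, s)) for s in (32, 16, 8)]
w1 = rng.standard_normal((C // 2, C)) * 0.1
w2 = rng.standard_normal((C, C // 2)) * 0.1
out = aggregate_with_attention(features, w1, w2)
print(out.shape)  # (8, 16, 16)
```

Because the fused map is built from every level before the gate is applied, each output location carries information from both shallow and deep layers, which is the intuition behind providing global information to the pyramid.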