Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation