STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation