Efficient and Robust Video Object Segmentation Through Isogenous Memory Sampling and Frame Relation Mining