Supplementary Material for “Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints”