Multi-modal remote sensing images carry complementary information that can enhance the performance of various applications. Image patch matching plays a crucial role in combining multi-modal images. However, multi-modal images differ greatly in appearance and texture, which makes image patch matching difficult. To address this problem, we propose a novel feature decomposition framework for multi-modal image patch matching that aims to remove the obstacle posed by these significant cross-modal differences. Specifically, we decompose the features of each image into common features and modality-private features. Only the common features are then used for image patch matching, in order to improve matching accuracy. Experimental results on optical and SAR images demonstrate that the proposed feature decomposition framework can significantly improve the performance of multi-modal image patch matching.