Image fusion is widely used across application areas: it integrates complementary and relevant information from source images captured by multiple sensors into a single synthetic image. As an efficient way to combine information from multiple images, it plays an increasingly important role in smart cities. The quality of the fused image affects the accuracy, efficiency, and robustness of the applications that depend on it. Existing sparse-representation-based image fusion methods consist of two stages: learning an overcomplete, redundant dictionary, and sparse coding. However, such a dictionary does not take discriminative ability into account, which may seriously degrade the fusion result. A good dictionary is the key to a successful image fusion technique. To construct a discriminative dictionary, a novel framework that integrates image-patch clustering with online dictionary learning is proposed for visible-infrared image fusion. Comparison experiments against existing solutions validate and demonstrate the effectiveness of the proposed solution for image fusion.
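
The pipeline outlined above (patch clustering, dictionary learning, sparse coding, fusion) can be written compactly. What follows is a minimal sketch in standard sparse-representation notation, not the paper's exact formulation. The symbols ($x_i$, $C_k$, $D_k$, $\alpha$, $\lambda$, $K$), the $\ell_1$-regularized objective (the usual one in online dictionary learning), and the max-$\ell_1$ fusion rule are all assumptions chosen for illustration.

```latex
% Illustrative notation (assumed, not taken from the paper):
%   x_i      vectorized image patch
%   C_k      k-th cluster of patches, k = 1, ..., K
%   D_k      sub-dictionary learned from cluster C_k
%            (columns usually constrained to have l2 norm at most 1)
%   \lambda  weight of the sparsity penalty
\begin{align}
  % Step 1: learn one sub-dictionary per patch cluster.
  D_k &= \arg\min_{D} \frac{1}{|C_k|} \sum_{i \in C_k} \min_{\alpha_i}
         \Big( \tfrac{1}{2}\,\| x_i - D\alpha_i \|_2^2
               + \lambda \|\alpha_i\|_1 \Big) \\
  % Step 2: concatenate the sub-dictionaries into one dictionary.
  D &= [\, D_1,\; D_2,\; \dots,\; D_K \,] \\
  % Step 3: sparse-code co-located visible (VIS) and infrared (IR) patches.
  \alpha^{s} &= \arg\min_{\alpha} \tfrac{1}{2}\,\| x^{s} - D\alpha \|_2^2
         + \lambda \|\alpha\|_1,
         \qquad s \in \{\mathrm{VIS}, \mathrm{IR}\} \\
  % Step 4 (assumed rule): keep the code with the larger l1 activity,
  % then reconstruct the fused patch.
  \alpha^{F} &=
    \begin{cases}
      \alpha^{\mathrm{VIS}}, & \|\alpha^{\mathrm{VIS}}\|_1 \ge \|\alpha^{\mathrm{IR}}\|_1 \\
      \alpha^{\mathrm{IR}},  & \text{otherwise}
    \end{cases},
  \qquad x^{F} = D\,\alpha^{F}
\end{align}
```

Under these assumptions, clustering the patches before learning is what makes each $D_k$ specialize in one kind of local structure, so the concatenated $D$ can be more discriminative than a single dictionary learned from all patches at once. The fused patches $x^{F}$ would then typically be placed back at their original positions, averaging where patches overlap, to form the fused image.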