Airborne laser scanning point cloud semantic labeling, which aims to identify the category of each point, plays a significant role in many applications, such as forest monitoring, powerline extraction, etc. With the rise of deep learning, approaches to point cloud interpretation have changed substantially. However, owing to the irregular and unordered nature of point clouds, it is difficult for a classification model to distinguish objects with similar geometry from single-modal data alone. Fortunately, additional information, e.g., color spectra, which are complementary to geometric information, can effectively improve classification performance. Therefore, the design of the fusion strategy is a critical part of model construction. In this article, to capture more abstract semantic information from color spectrum data, we design a color spectrum fusion (CSF) module. It can be flexibly integrated into a classification pipeline at a negligible parameter cost. We then extend data fusion ideas to point clouds and color spectra and investigate three possible fusion strategies. Accordingly, we develop three architectures to construct CSF-Nets. Finally, using a weighted cross-entropy loss, our CSF-Nets can be trained in an end-to-end manner. Experiments on two widely used datasets, Vaihingen 3D and LASDU, show that all three fusion approaches improve performance, with the early fusion strategy performing best. Moreover, compared with other well-performing methods, CSF-Net still achieves satisfactory overall accuracy and m$F_{1}$-score. This further validates the effectiveness of our multimodal fusion network.
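The abstract mentions training with a weighted cross-entropy loss, a common remedy for class imbalance in semantic labeling. As a minimal illustrative sketch (not the paper's exact implementation; the per-class weighting scheme and normalization here are assumptions), a weighted cross-entropy over per-point logits could look like:

```python
import numpy as np

def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted cross-entropy over N points and C classes.

    logits:        (N, C) raw per-point class scores
    labels:        (N,)   integer class ids
    class_weights: (C,)   per-class weights (e.g., inverse class frequency)
    """
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class for each point.
    nll = -log_probs[np.arange(len(labels)), labels]
    # Scale each point's loss by its class weight, then normalize.
    w = class_weights[labels]
    return float((w * nll).sum() / w.sum())
```

With uniform logits the loss reduces to $\ln C$ regardless of the weights, which is a quick sanity check; rare classes receive larger weights so their misclassifications contribute more to the gradient.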