Coalbed methane (CBM) has emerged as a clean unconventional resource that can help meet the rising demand for oil and gas. Analyzing and predicting CBM production performance are critical in choosing the optimal completion methods and parameters. However, conventional numerical simulation suffers from complicated gridding and high computational cost. The large volume of production data collected at field sites opens up a new opportunity to develop data-driven approaches for predicting production rates. Here, we propose a novel physics-constrained data-driven workflow to forecast CBM productivity, based on a neural network that combines a gated recurrent unit (GRU) with a multilayer perceptron (MLP), referred to as the GRU-MLP model. The model architecture is optimized automatically by a multiobjective algorithm, the nondominated sorting genetic algorithm II (NSGA-II). The proposed framework was used to predict gas and water production in synthetic cases with varying fracture-network complexity and connectivity, and in two multistage fractured horizontal wells at field sites in the Ordos Basin and Qinshui Basin, China. The results indicate that the proposed GRU-MLP network predicts the production performance of fractured CBM wells quickly, accurately, and stably. Compared with a recurrent neural network (RNN), a GRU, and a long short-term memory (LSTM) network, the proposed GRU-MLP had the highest accuracy, stability, and generalization, especially around production peaks and troughs and in the late-time production period, because it captured the production-variation trends precisely under the static and dynamic physical constraints. Consequently, the physics-constrained data-driven approach performed better than a purely data-driven method.
Moreover, the contribution of each constraint to the model prediction performance was clarified, which can help practicing engineers decide which categories of constraints to focus on and treat preferentially when a real reservoir involves uncertainties and unknowns. In addition, the optimum GRU-MLP model architecture was a group of optimized solutions rather than a single solution. Engineers can evaluate the tradeoffs within this optimal set according to the field-site requirements. This study provides a novel machine learning approach based on a GRU-MLP combined neural network to estimate production performance in naturally fractured reservoirs. The method is gridless and simple, yet it predicts productivity in a computationally cost-effective way. The key findings of this work are expected to provide theoretical guidance for intelligent development in the oil and gas industry.
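The statement that the optimum architecture is a set of solutions rather than a single one can be illustrated with a minimal sketch of nondominated filtering, the idea behind the first front in NSGA-II. This is not the workflow used in this study. The two objectives (validation error and parameter count, both minimized), the candidate labels, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate: (label, validation_error, parameter_count).
# Both objectives are assumed to be minimized; the study's objectives may differ.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b in every objective and better in at least one."""
    obj_a, obj_b = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(obj_a, obj_b))
    strictly_better = any(x < y for x, y in zip(obj_a, obj_b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]


# Made-up architectures for illustration only; these are not results.
candidates: List[Candidate] = [
    ("A", 0.10, 5000),
    ("B", 0.08, 20000),
    ("C", 0.12, 6000),   # dominated by A (higher error and more parameters)
    ("D", 0.08, 25000),  # dominated by B (same error, more parameters)
    ("E", 0.20, 1000),
]

front = pareto_front(candidates)
front_labels = [c[0] for c in front]
print(front_labels)
```

In this toy set, three candidates remain nondominated: each trades accuracy against model size, and none is better than the others on both objectives at once. Choosing among them is the kind of tradeoff an engineer would settle according to field-site requirements.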