As one of the most essential sources of energy, byproduct gas plays a pivotal role in the steel industry, for which the flow tendency is generally regarded as the guidance for planning and scheduling in real production. In order to obtain the numeric estimation along with its reliability, the construction of prediction intervals (PIs) is highly demanded by any practical applications as well as being long term for providing more information on future trends. Bearing this in mind, in this article, a hierarchical granular computing (HGrC)-based model is established for constructing long-term PIs, in which probabilistic modeling gives rise to a long horizon of numeric prediction, and the deployment of information granularities hierarchically extends the result to be interval-valued format. Considering that the structure of this model has a direct impact on its performance, Monte-Carlo search with a policy gradient technique is then applied for reinforcement structure learning. Compared with the existing methods, the size (length) of the granules in the proposed approach is unequal so that it becomes effective for not only periodic but also nonperiodic data. Furthermore, with the use of parallel strategy, the efficiency can be also guaranteed for real-world applications. The experimental results demonstrate that the proposed method is superior to other commonly encountered techniques, and the stability of the structure learning process behaves better when compared with other reinforcement learning approaches.