Pipeline infrastructures that transport gas or oil suffer from internal corrosion throughout their entire lifespan. This phenomenon can be very dangerous for both the environment and human beings: the former because of potential leaks of the fluids carried by the infrastructure, the latter because of accidents, such as explosions in the presence of gas leaks. Therefore, it is crucial to design predictive mechanisms able to improve the prevention and control of this phenomenon [1]. Unfortunately, pipeline corrosion is not yet understood well enough to develop a mechanistic model, which would address the prevention and control needs associated with the management of such infrastructures. Moreover, the phenomenon is complex enough that semi-empirical models fail to reproduce its behavior. Recently, Machine Learning (ML) techniques have proven capable of modeling complex phenomena when given sufficient and appropriate data, making them a promising solution for corrosion prediction. However, the solutions proposed in the literature are either based on small data sets or rely on performance evaluations that are not appropriately carried out, which undermines their claims and results. For these reasons, in this paper, we introduce an ML-based approach to model the corrosion phenomenon, comprising the creation of the data set, the definition of the ML-based model, and its evaluation. Finally, we apply the proposed approach to real-world data.