Prediction of Ticket Prices for Public Transport Using Linear Regression and Random Forest Regression Methods: A Practical Approach Using Machine Learning

The Spanish High-Speed Train Service (Renfe AVE) dataset comes from a ticket-pricing monitoring system that periodically scrapes ticket prices and stores them in a database. Ticket prices change with demand and time, and the differences can be significant. The dataset was designed by a team of data scientists, Pedro Munoz and David Canones. The data is well suited to machine learning models that predict the fare (price) of a ticket from the date, the origin and destination locations, the train class, and the train type. The dataset contains a few null values, which must be handled before the models can be trained effectively. Two machine learning models were used, linear regression and random forest regression. The random forest regressor gave the better accuracy score, with 79.06% on the training dataset and 80.10% on the testing dataset. The random forest model is therefore the more accurate of the two, and it can help a user check a train journey with a ticket fare computed automatically by the model.
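The workflow described above (drop null fares, encode the categorical fields, fit both regressors, compare training and testing scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data is synthetic, the column names (`destination`, `train_type`, `train_class`, `hour`, `price`) and the helper functions are hypothetical, and the reported score is scikit-learn's default R² for regressors, which may or may not be the "accuracy score" the study reports.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def make_synthetic_tickets(n_rows=600, seed=0):
    """Build a small made-up fare table. This is NOT the real Renfe data."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "destination": rng.choice(["BARCELONA", "SEVILLA", "VALENCIA"], n_rows),
        "train_type": rng.choice(["AVE", "ALVIA"], n_rows),
        "train_class": rng.choice(["Turista", "Preferente"], n_rows),
        "hour": rng.integers(6, 22, n_rows),
    })
    # Invented pricing rule: route base fare, class multiplier, peak-hour surcharge.
    base = df["destination"].map(
        {"BARCELONA": 85.0, "SEVILLA": 60.0, "VALENCIA": 45.0}
    )
    premium = np.where(df["train_class"] == "Preferente", 1.6, 1.0)
    peak = np.where(df["hour"].isin([7, 8, 18, 19]), 15.0, 0.0)
    df["price"] = base * premium + peak + rng.normal(0.0, 3.0, n_rows)
    # Blank out 30 prices to mimic the null values mentioned in the text.
    df.loc[rng.choice(n_rows, 30, replace=False), "price"] = np.nan
    return df


def compare_models(df, seed=0):
    """Fit both regressors and return {name: (train_score, test_score)}."""
    clean = df.dropna(subset=["price"])  # discard rows with a missing fare
    # One-hot encode the categorical columns; "hour" stays numeric.
    X = pd.get_dummies(clean.drop(columns="price"), dtype=float)
    y = clean["price"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # .score() on a scikit-learn regressor returns R^2.
        scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    return scores


if __name__ == "__main__":
    for name, (train, test) in compare_models(make_synthetic_tickets()).items():
        print(f"{name}: train R^2 = {train:.4f}, test R^2 = {test:.4f}")
```

Dropping rows with a missing fare is only one way to handle nulls; imputing them (for example with a per-route mean) is an alternative when the missing rows are too numerous to discard.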
