Improving the Station-Level Demand Prediction by Using Feature Engineering in Bike Sharing Systems

Bike sharing systems have been widely adopted around the world because they are convenient and environmentally friendly. One of the key issues for the successful operation of such a system is accurate bike demand prediction, which allows bikes to be reallocated among stations so as to avoid station jams, that is, situations in which no dock is available for returning a bike or no bike is available. Much research has been carried out on bike demand prediction, and common impact factors such as time and weather have been well studied. To achieve better performance, stations are often clustered into groups and the prediction is made at the cluster level. However, effective and precise demand prediction for individual stations is still necessary. This paper aims to improve the accuracy of station-level demand prediction. Influential factors beyond time and weather are identified, and a feature model composed of context features, correlation features, and user features (CCU) is proposed. A two-phase prediction framework is presented: feature engineering is used to extract features in the first phase, and three classic regression algorithms are discussed in the second. Experiments based on Citi Bike data from New York City have been conducted to verify the effectiveness of our method. The results show that our method outperforms the baseline by nearly 10% and achieves acceptable accuracy at the station level.
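As a rough illustration of the two-phase framework summarized above, the following Python sketch first builds one feature row per hour from context, correlation, and user information, and then fits a regressor on those rows. All names (`build_features`, `fit_linear`, `predict`), the trip tuple layout, the choice of a single neighboring station, and the specific features are assumptions made for illustration and are not taken from the paper. Ordinary least squares merely stands in for the three regression algorithms, which the abstract does not name.

```python
from collections import Counter

# Hypothetical input layout (an assumption, not the Citi Bike schema):
#   trips: list of (start_station, hour_index, is_subscriber)
#   temps: dict mapping hour_index -> temperature


def build_features(trips, temps, station, neighbor, hours):
    """Phase 1: build one CCU-style feature row per hour for `station`."""
    counts = Counter((s, h) for s, h, _ in trips)
    subs = Counter((s, h) for s, h, is_sub in trips if is_sub)
    rows, targets = [], []
    for h in hours:
        prev = counts[(station, h - 1)]
        # Context: time of day and weather.
        context = [h % 24, temps[h]]
        # Correlation: demand at a neighboring station in the previous hour.
        correlation = [counts[(neighbor, h - 1)]]
        # User: share of subscribers among last hour's riders.
        user = [subs[(station, h - 1)] / prev if prev else 0.0]
        rows.append([1.0] + context + correlation + user)  # 1.0 = intercept
        targets.append(counts[(station, h)])
    return rows, targets


def fit_linear(X, y, ridge=1e-6):
    """Phase 2 stand-in: least squares via the normal equations."""
    n = len(X[0])
    # Build (X^T X + ridge * I) and X^T y; the small ridge term keeps
    # the system solvable when features are collinear.
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[i] / A[i][i] for i in range(n)]


def predict(weights, row):
    """Predicted hourly demand for one feature row."""
    return sum(w * x for w, x in zip(weights, row))
```

In use, `build_features` would be called once per station over the training hours, the resulting rows passed to `fit_linear`, and `predict` applied to the feature row of the hour to be forecast. Keeping the two phases separate means the regressor can be swapped without touching the feature extraction.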