An Advanced Data Warehouse for Integrating Large Sets of GPS Data

GPS data recorded from moving vehicles is available from many sources and provides a strong foundation for answering traffic-related queries. However, most existing approaches do not combine GPS data from many sources into a single data warehouse. Furthermore, GPS data has not previously been integrated with fuel consumption data (from the vehicles' so-called CAN bus) and weather data. In this paper, we propose a data warehouse design for handling GPS data, fuel consumption data, and weather data. The design is fully implemented in a running system using the PostgreSQL DBMS. The system has been in production since March 2011, and the main fact table currently contains approximately 3.4 billion rows from 16 different data sources. We show that the system can be used for a number of novel traffic-related analyses, such as relating the fuel consumption of vehicles to the road network and to road congestion.