Investigation of Extraction, Transformation, and Loading Techniques for Traffic Data Warehouses

Archived data management systems (ADMSs) are data warehouses created to support analyses based on data collected by transportation operations systems. Although many believe that an ADMS can be created by simply exporting data from an operations system, experience in developing the Virginia ADMS illustrates that an effective ADMS requires careful attention to the extraction, transformation, and loading (ETL) process, that is, the activities conducted when a data warehouse is created from an operational data store. Four critical elements of an ADMS ETL process are addressed: data aggregation, data quality assessment, data imputation, and data characterization. For each element, the purpose and need are documented, alternative implementation methods are reviewed, and the approach used in the Virginia ADMS is described.