Big Data ETL Implementation Approaches: A Systematic Literature Review (P)

Extract, transform, load (ETL) is an essential technique for integrating data from multiple sources into a data warehouse. ETL is applicable to data warehousing, big data, and business intelligence. Through a systematic literature review of 97 papers, this research identifies and evaluates the approaches currently used to implement ETL solutions. We found that conceptual modeling, using notations and frameworks such as UML, BPMN, and MDA, is the most popular approach to implementing ETL solutions. However, innovative approaches such as machine learning, artificial intelligence, and robotics are either under-utilized or not used at all to develop ETL solutions. Additionally, we discuss the implications of these findings for ETL research and practice.