Automating data integration with HiperFuse

Integrating heterogeneous datasets remains a significant barrier to many analytics tasks: raw datasets vary widely in structure and cleanliness, so each integration typically requires one-off ETL code. We propose HiperFuse, which automates much of the data integration process by providing a declarative interface, robust type inference, extensible domain-specific data models, and a data integration planner that optimizes for plan completion time.