Big Data Acquisition

Different data processing architectures have been proposed to address the distinct characteristics of big data. Data acquisition is understood as the process of gathering, filtering, and cleaning data before it is placed in a data warehouse or another storage solution. The acquisition of big data is most commonly governed by four of the Vs: volume, velocity, variety, and value. Most acquisition scenarios assume high-volume, high-velocity, high-variety, but low-value data, which makes adaptable and time-efficient gathering, filtering, and cleaning algorithms essential, so that only the high-value fragments of the data are actually processed in the data-warehouse analysis.

The goals of this chapter are threefold. First, it identifies the current requirements for data acquisition by presenting open, state-of-the-art frameworks and protocols that companies can use for big data acquisition. Second, it surveys the approaches currently used for data acquisition in the different sectors. Finally, it discusses how current approaches meet these requirements, as well as possible future developments in this area.
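To make the gather-filter-clean pattern concrete, the following sketch shows one possible shape of an acquisition step that discards low-value records before they reach the storage layer. It is a minimal illustration in Python, not the implementation of any particular framework discussed later: the record schema, the is_high_value predicate, and the batch size are illustrative assumptions.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Record:
    source: str
    timestamp: float
    payload: dict


def gather(raw_lines: Iterable[str]) -> Iterator[Record]:
    """Parse incoming raw JSON lines, skipping malformed input."""
    for line in raw_lines:
        try:
            obj = json.loads(line)
            yield Record(obj["source"], float(obj["timestamp"]), obj.get("payload", {}))
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            continue  # drop records that cannot be parsed


def is_high_value(record: Record) -> bool:
    """Hypothetical value predicate: keep only records with a non-empty payload."""
    return bool(record.payload)


def clean(record: Record) -> Record:
    """Normalise field names and strip whitespace from string values."""
    cleaned = {
        key.strip().lower(): (value.strip() if isinstance(value, str) else value)
        for key, value in record.payload.items()
    }
    return Record(record.source, record.timestamp, cleaned)


def acquire(raw_lines: Iterable[str], batch_size: int = 100) -> Iterator[list]:
    """Gather, filter, and clean records, emitting batches ready for storage."""
    batch = []
    for record in gather(raw_lines):
        if not is_high_value(record):
            continue  # discard low-value data before it reaches the warehouse
        batch.append(clean(record))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    stream = [
        '{"source": "sensor-1", "timestamp": 1625000000, "payload": {"Temp ": " 21.5"}}',
        '{"source": "sensor-2", "timestamp": 1625000001, "payload": {}}',
        'not valid json',
    ]
    for batch in acquire(stream, batch_size=10):
        print(batch)  # in practice, the batch would be written to the storage layer
```

In this sketch only the first record survives: the second carries no payload (low value) and the third cannot be parsed, so neither consumes downstream storage or analysis resources.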