A Holistic Approach for High-level Programming of Next-generation Data-intensive Applications Targeting Distributed Heterogeneous Computing Environment

Abstract The intrinsic richness and heterogeneity of large amounts of data are paired with the extreme complexity of storing and processing them, as well as with the heterogeneity of the processing environments, which range from supercomputers to federations of Cloud data-centres. This makes the conception, definition and implementation of software tools for programming applications that deal with very large amounts of data challenging from several perspectives, ranging from technological issues to economic concerns. We propose an approach focused on data-intensive applications that goes beyond the state of the art by allowing seamless exploitation of heterogeneous and distributed resources and by satisfying users’ data-processing needs through a dynamically determined set of features that depends on the running environment, the application and the user requirements.