Big data

Big data is everywhere. In just about every part of the modern world, scientists and engineers are developing new ways to measure events. Whether it's sensors, traffic cameras, sales data, Web usage, gene expression, or just about anything else, we have entered an age of truly massive data. Why do we collect this data? It's simple: to learn. We want to make predictions, quantify reality, or understand the past to optimize the decisions we make.

Massive data leads to many challenges for computer scientists. We're recording petabytes of data every day. Before we even think about learning from it, how and where do we store it? What kinds of systems do we build to retrieve and analyze the data? Can we develop theoretical guarantees about the optimality of these systems and strategies? Even if we can store the data, how do we learn from data sets that we cannot hold on a single computer or even in many computers? Can we learn from data on the fly? Moreover, our data is heterogeneous: We are observing social networks, ad click-throughs, gene sequences, protein concentrations from cells, as well as confidential personal data that must be kept secret. How do we adapt our systems and algorithms for all kinds of data?

These are just some of the exciting challenges facing the big data community. For a topic as diverse as big data, it is nearly impossible to provide a comprehensive picture. Instead, in this issue we try to highlight some recent developments organized into three main themes: the theoretical foundation providing models and algorithms for reasoning about various data processing tasks, the large-scale computer systems for handling big data, and the range of applications and analyses enabled by big data from a variety of scientific domains. It has been an interesting time for big data, with innovations coming simultaneously from theorists, system builders, and scientists or application designers.
We hope to provide readers with an idea of the interplay between developments in these three different communities: how ideas and priorities in different communities interact and together drive forward the development of big data analysis.

Theory

Opening the issue is an introduction to the theo-

interest in big data has given rise to a lot of recent interest in building systems to support queries and transactions over massive quantities of data.