Visualisation Tools for Understanding Big Data
In recent years, fuelled by continuing technological advances, there has been an explosion in the creation and publication of visualisations of what has come to be called 'big data'. In the last editorial (Batty, 2012), we wrote about how the rise of the smart city through its routine instrumentation is leading to a generation of enormous real-time datasets (big data) that have the potential to provide us with entirely new information about the functioning of cities at fine scales and over very short time periods. Understanding such data, however, remains a major challenge. Not so long ago, mapping the distributions of phenomena from such large datasets required weeks of data preparation before analysis could even begin. Now the volume of data released each day exceeds anything that could have been collected in the typical academic lifetime of a generation ago.

Indeed, informed opinion suggests that the world's information (data) is doubling every two years, and that last year (2011) 1.8 zettabytes were collected, a zettabyte being 2 to the 70th power, or roughly 10 to the 21st power, bytes. It is hard to visualise this number, never mind the data. Catone (2011) suggests that it is equivalent to the storage of 57.5 billion 32 GB iPads but, whatever the analogy, the number is too big to comprehend. Needless to say, in the face of such proliferation, most of this information will be lost, even though its long-term storage in digital archives remains theoretically possible.

This transformation in data production has been fuelled by private companies which, for the first time, collect more personal information than central government. For example, the McKinsey Global Institute (2011) reports that fifteen out of seventeen business sectors in the United States now hold more data per company, on average, than the Library of Congress. This shift is significant in the context of visualisation because the quality of data, especially with regard to its representativeness, does not necessarily increase with volume, and will be a lower priority for many data holders, especially in comparison with national census agencies. This is a major issue for the UK's Office for National Statistics as it seeks to phase out the decennial Population Census. That said, most big datasets are at their most representative for urban populations and so have much to offer wide-ranging studies of urban complexity. They are also increasingly temporal, offering the potential for the analysis of flows on an unprecedented scale (Batty and Cheshire, 2011). Data is increasingly about interactions and relations, about networks and connections, and it is little surprise that the current cutting edge of visualisation lies in visualising networks.

The temporal element of big datasets extends to the increasing number of real-time feeds as cities seek to become smarter and the multiple infrastructures within them become better connected. In London, for example, real-time feeds exist for everything from the current depth of the River Thames through to the positions of London Underground trains on the network and waiting times at bus stops. Context can be added to these feeds through timetable information (to help calculate delays), passenger flow information, and a wealth of socioeconomic datasets. In tabulated form these data can easily extend to billions of rows and require hundreds, if not thousands, of gigabytes of storage space.
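These magnitudes are easier to grasp with a little arithmetic. The short Python sketch below reproduces the order of magnitude of Catone's iPad analogy, and then shows, under purely illustrative assumptions (the number of stops, the polling interval, and the bytes per row are hypothetical values, not measured ones), how quickly a single real-time feed of the kind described above runs into billions of rows:

# Back-of-envelope arithmetic for two of the figures quoted above.
# All constants below are illustrative assumptions, not measured values.

# 1. Catone's (2011) analogy: 1.8 zettabytes expressed in 32 GB iPads.
ZETTABYTE = 10 ** 21          # decimal zettabyte, in bytes
IPAD_BYTES = 32 * 10 ** 9     # a 32 GB iPad, in bytes

ipads = 1.8 * ZETTABYTE / IPAD_BYTES
print(f"1.8 ZB is roughly {ipads / 1e9:.1f} billion 32 GB iPads")
# Prints ~56 billion with decimal units; Catone's 57.5 billion figure
# implies slightly different unit conventions, but the order of
# magnitude is the same.

# 2. How a single real-time feed grows to billions of rows.
STOPS = 19_000        # assumed number of bus stops polled
POLL_SECONDS = 30     # assumed interval between observations per stop
ROW_BYTES = 100       # assumed size of one tabulated observation

rows_per_day = STOPS * (24 * 3600 // POLL_SECONDS)
rows_per_year = rows_per_day * 365
gigabytes_per_year = rows_per_year * ROW_BYTES / 10 ** 9

print(f"{rows_per_day / 1e6:.0f} million rows per day")
print(f"{rows_per_year / 1e9:.1f} billion rows per year")
print(f"~{gigabytes_per_year:.0f} GB per year for this feed alone")

The exact totals depend entirely on the assumed constants, but even this conservative polling regime yields roughly twenty billion rows and around two terabytes of tabulated data per year, squarely in the range quoted above.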
So, in this evolving 'big data' landscape, how can we, as researchers, contribute, and what does visualisation have to offer? Although we cannot show it here, we direct readers instead to its display at http://mappinglondon.co.uk/2012/04/17/mapped-every-bus-trip-in-london/, London's bus network is
[1] Batty M and Cheshire J (2011) Cities as flows, cities of flows. Environment and Planning B: Planning and Design 38(2).
[2] Batty M (2012) Smart cities, big data. Environment and Planning B: Planning and Design 39(2).