Geospatial Big Data Handling Theory and Methods: A Review and Research Challenges

Big data has become a strong focus of global interest and is increasingly attracting the attention of academia, industry, government and other organizations. Big data can be situated in the disciplinary area of traditional geospatial data handling theory and methods. The increasing volume and varying format of collected geospatial big data present challenges in storing, managing, processing, analyzing, visualizing and verifying the quality of data. This has implications for the quality of decisions made with big data. Consequently, this position paper of the International Society for Photogrammetry and Remote Sensing (ISPRS) Technical Commission II (TC II) revisits the existing geospatial data handling methods and theories to determine whether they are still capable of handling emerging geospatial big data. Further, the paper synthesises problems, major issues and challenges with current developments, and recommends what needs to be developed further in the near future.

Keywords: Big data, Geospatial, Data handling, Analytics, Spatial Modeling, Review
