Big Data and cloud computing: innovation opportunities and challenges

ABSTRACT Big Data has emerged in the past few years as a new paradigm, providing abundant data and opportunities to improve or enable research and decision-support applications with unprecedented value for digital earth applications in business, the sciences and engineering. At the same time, Big Data challenges digital earth to store, transport, process, mine and serve the data. Cloud computing provides fundamental support for addressing these challenges through shared resources, including computing, storage, networking and analytical software; the application of these resources has fostered impressive Big Data advancements. This paper surveys the two frontiers, Big Data and cloud computing, and reviews the advantages and consequences of utilizing cloud computing to tackle Big Data in the digital earth and relevant science domains. Covering a general introduction, sources, challenges, technology status and research opportunities, the paper offers the following observations: (i) cloud computing and Big Data enable science discoveries and application developments; (ii) cloud computing provides major solutions for Big Data; (iii) Big Data, spatiotemporal thinking and various application domains drive the advancement of cloud computing and relevant technologies with new requirements; (iv) intrinsic spatiotemporal principles of Big Data and geospatial sciences provide the source for finding technical and theoretical solutions to optimize cloud computing and the processing of Big Data; (v) the open availability of Big Data and processing capability poses social challenges of geospatial significance; and (vi) a weave of innovations is transforming Big Data into geospatial research, engineering and business values. This review introduces future innovations and a research agenda for cloud computing that supports transforming the volume, velocity, variety and veracity of Big Data into value for local to global digital earth science and applications.


[235]  Syed Akhter Hossain,et al.  NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison , 2013, ArXiv.

[236]  Jose María Álvarez Rodríguez,et al.  Semantic-based QoS management in cloud systems: Current status and future challenges , 2014, Future Gener. Comput. Syst..

[237]  A. Oguntimilehin,et al.  A Review of Big Data Management, Benefits and Challenges , 2014 .

[238]  Valentin Cristea,et al.  Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing , 2015, Future Gener. Comput. Syst..

[239]  Zhenlong Li,et al.  An optimized framework for seamlessly integrating OGC Web Services to support geospatial sciences , 2011, Int. J. Geogr. Inf. Sci..

[240]  Eric Gossett,et al.  Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2015 .

[241]  Silvana Trimi,et al.  Big-data applications in the government sector , 2014, Commun. ACM.

[242]  Advances in Very-High-Resolution Remote Sensing , .

[243]  Ryosuke Shibasaki,et al.  The Design of Large Scale Data Management for Spatial Analysis on Mobile Phone Dataset , 2013 .

[244]  D. K. Branstad,et al.  Data Encryption Standard: past and future , 1988, Proc. IEEE.

[245]  Simon Wilson,et al.  weather@home—development and validation of a very large ensemble modelling system for probabilistic event attribution , 2015 .

[246]  Chitra Balakrishna,et al.  Enabling Technologies for Smart City Services and Applications , 2012, 2012 Sixth International Conference on Next Generation Mobile Applications, Services and Technologies.

[247]  Hans Schaffers,et al.  Smart Cities and the Future Internet: Towards Cooperation Frameworks for Open Innovation , 2011, Future Internet Assembly.

[248]  Kai Liu,et al.  Using Semantic Search and Knowledge Reasoning to Improve the Discovery of Earth Science Records: An Example with the ESIP Semantic Testbed , 2014, Int. J. Appl. Geospat. Res..

[249]  Hans-Ulrich Prokosch,et al.  A scoping review of cloud computing in healthcare , 2015, BMC Medical Informatics and Decision Making.

[250]  N. Tapus,et al.  Practical application and evaluation of no-SQL databases in Cloud Computing , 2012, 2012 IEEE International Systems Conference SysCon 2012.

[251]  Dursun Delen,et al.  Leveraging the capabilities of service-oriented decision support systems: Putting analytics and big data in cloud , 2013, Decis. Support Syst..

[252]  Massimo Torquati,et al.  Parallel patterns for heterogeneous CPU/GPU architectures: Structured parallelism from cluster to cloud , 2014, Future Gener. Comput. Syst..

[253]  Ching-Hsien Hsu,et al.  An improved partitioning mechanism for optimizing massive data analysis using MapReduce , 2013, The Journal of Supercomputing.

[254]  Qunying Huang,et al.  Using adaptively coupled models and high-performance computing for enabling the computability of dust storm forecasting , 2013, Int. J. Geogr. Inf. Sci..

[255]  Dong Chen,et al.  Profiling, Quantifying, Modeling and Evaluating Green Service Level Objectives in Cloud Computing Environments: Profiling, Quantifying, Modeling and Evaluating Green Service Level Objectives in Cloud Computing Environments , 2014 .

[256]  Zhenlong Li,et al.  Building Model as a Service to support geosciences , 2017, Comput. Environ. Urban Syst..

[257]  David R. O'Hallaron,et al.  Tashi: location-aware cluster management , 2009, ACDC '09.

[258]  Joseph M. Hellerstein Datalog redux: experience and conjecture , 2010, PODS '10.

[259]  Veda C. Storey,et al.  Business Intelligence and Analytics: From Big Data to Big Impact , 2012, MIS Q..

[260]  Marimuthu Palaniswami,et al.  An Information Framework for Creating a Smart City Through Internet of Things , 2014, IEEE Internet of Things Journal.

[261]  Qichang Chen,et al.  MRGIS: A MapReduce-Enabled High Performance Workflow System for GIS , 2008, 2008 IEEE Fourth International Conference on eScience.

[262]  Yoshinobu Tamura,et al.  Reliability Analysis Based on a Jump Diffusion Model with Two Wiener Processes for Cloud Computing with Big Data , 2015, Entropy.

[263]  Divyakant Agrawal,et al.  Big data and cloud computing: current state and future opportunities , 2011, EDBT/ICDT '11.

[264]  William Nick Street,et al.  Healthcare information systems: data mining methods in the creation of a clinical recommender system , 2011, Enterp. Inf. Syst..

[265]  Stefano Nativi,et al.  Big Data challenges in building the Global Earth Observation System of Systems , 2015, Environ. Model. Softw..

[266]  Peng Yue,et al.  An SDI Approach for Big Data Analytics: The Case on Sensor Web Event Detection and Geoprocessing Workflow , 2015, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[267]  Bryan C. Pijanowski,et al.  A big data urban growth simulation at a national scale: Configuring the GIS and neural network based Land Transformation Model to run in a High Performance Computing (HPC) environment , 2014, Environ. Model. Softw..

[268]  Bruno Simões,et al.  Big data through cross-platform interest-based interactivity , 2014, 2014 International Conference on Big Data and Smart Computing (BIGCOMP).

[269]  Elisa Bertino,et al.  WORKSHOP REPORT BIG DATA SECURITY AND PRIVACY Sponsored by the National Science Foundation , 2014 .

[270]  Marcos K. Aguilera,et al.  Consistency-based service level agreements for cloud storage , 2013, SOSP.

[271]  Klara Nahrstedt,et al.  Evaluation and Analysis of GreenHDFS: A Self-Adaptive, Energy-Conserving Variant of the Hadoop Distributed File System , 2010, 2010 IEEE Second International Conference on Cloud Computing Technology and Science.

[272]  Samee Ullah Khan,et al.  Future Generation Computer Systems ( ) – Future Generation Computer Systems a Cloud Based Health Insurance Plan Recommendation System: a User Centered Approach , 2022 .

[273]  Kishor S. Trivedi,et al.  Combining Cloud and sensors in a smart city environment , 2012, EURASIP J. Wirel. Commun. Netw..

[274]  Åke Grönlund,et al.  Cloud computing: The beliefs and perceptions of Swedish school principals , 2015, Comput. Educ..

[275]  C. Dobre,et al.  A SLA-based method for big-data transfers with multi-criteria optimization constraints for IaaS , 2013, 2013 11th RoEduNet International Conference.

[276]  Jinjun Chen,et al.  A security framework in G-Hadoop for big data computing across distributed Cloud data centres , 2014, J. Comput. Syst. Sci..

[277]  Paulo Carreira,et al.  Energy Cloud: Real-Time Cloud-Native Energy Management System to Monitor and Analyze Energy Consumption in Multiple Industrial Sites , 2014, 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing.

[278]  Madhusudhan Govindaraju,et al.  Configuring a MapReduce Framework for Dynamic and Efficient Energy Adaptation , 2012, 2012 IEEE Fifth International Conference on Cloud Computing.

[279]  Dongwoo Kang,et al.  Burstiness-aware I/O scheduler for MapReduce framework on virtualized environments , 2014, 2014 International Conference on Big Data and Smart Computing (BIGCOMP).

[280]  Xiaoyi Zhou,et al.  A Research on Smart Tourism Service Mechanism Based on Context Awareness , 2014, CIT 2014.

[281]  D. Butler Data, data everywhere... , 2005, Nature Structural &Molecular Biology.

[282]  Joseph M. Hellerstein,et al.  Boom analytics: exploring data-centric, declarative programming for the cloud , 2010, EuroSys '10.

[283]  Qunying Huang,et al.  Using spatial principles to optimize distributed computing for enabling the physical science discoveries , 2011, Proceedings of the National Academy of Sciences.

[284]  Ying Jiang,et al.  A Data Localization Algorithm for Distributing Column Storage System of Big Data , 2013 .

[285]  Yifan Bo,et al.  The Application of Cloud Computing and the Internet of Things in Agriculture and Forestry , 2011, 2011 International Joint Conference on Service Sciences.

[286]  Paul D. Manuel,et al.  A trust model of cloud computing based on Quality of Service , 2015, Ann. Oper. Res..

[287]  Lori M. Kaufman,et al.  Data Security in the World of Cloud Computing , 2009, IEEE Security & Privacy.

[288]  Brian David Johnson,et al.  Entertainment in the Age of Big Data , 2012, Proceedings of the IEEE.

[289]  Surya Nepal,et al.  An autonomic framework for enhancing the quality of data grid services , 2012, Future Gener. Comput. Syst..

[290]  Justin Grimmer,et al.  We Are All Social Scientists Now: How Big Data, Machine Learning, and Causal Inference Work Together , 2014, PS: Political Science & Politics.

[291]  Sun Da Profiling,Quantifying,Modeling and Evaluating Green Service Level Objectives in Cloud Computing Environments , 2013 .

[292]  Qunying Huang,et al.  Activity patterns, socioeconomic status and urban spatial structure: what can social media data tell us? , 2016, Int. J. Geogr. Inf. Sci..

[293]  Wendi Heinzelman,et al.  COMBAT: mobile-Cloud-based cOmpute/coMmunications infrastructure for BATtlefield applications , 2012, Defense, Security, and Sensing.

[294]  Rion Dooley,et al.  Life science data analysis workflow development using the bioextract server leveraging the iPlant collaborative cyberinfrastructure , 2015, Concurr. Comput. Pract. Exp..

[295]  Keqiu Li,et al.  Big Data Processing in Cloud Computing Environments , 2012, 2012 12th International Symposium on Pervasive Systems, Algorithms and Networks.

[296]  Young-Kuk Kim,et al.  An Intrusive Analyzer for Hadoop Systems Based on Wireless Sensor Networks , 2014, Int. J. Distributed Sens. Networks.

[297]  Yongzhao Zhan,et al.  Maximum Neighborhood Margin Discriminant Projection for Classification , 2014, TheScientificWorldJournal.

[298]  Taku A. Tokuyasu,et al.  Hypergraph visualization and enrichment statistics: how the EGAN paradigm facilitates organic discovery from big data , 2011, Electronic Imaging.

[299]  Jinjun Chen,et al.  A Time Efficient Approach for Detecting Errors in Big Sensor Data on Cloud , 2015, IEEE Transactions on Parallel and Distributed Systems.

[300]  Chaowei Yang,et al.  Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework , 2015, PloS one.

[301]  Hilo,et al.  THE ELEVENTH AND TWELFTH DATA RELEASES OF THE SLOAN DIGITAL SKY SURVEY: FINAL DATA FROM SDSS-III , 2015, 1501.00963.

[302]  Sandeep K. Sood,et al.  Scheduling of big data applications on distributed cloud based on QoS parameters , 2014, Cluster Computing.

[303]  Thomas S. Huang,et al.  Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery , 2016, ISPRS Int. J. Geo Inf..

[304]  Kai Liu,et al.  Optimizing an index with spatiotemporal patterns to support GEOSS Clearinghouse , 2014, Int. J. Geogr. Inf. Sci..

[305]  Parijat Dube,et al.  Autoscaling for Hadoop Clusters , 2016, 2016 IEEE International Conference on Cloud Engineering (IC2E).

[306]  C. Kesselman,et al.  A Metadata Catalog Service for Data Intensive Applications , 2003, ACM/IEEE SC 2003 Conference (SC'03).

[307]  Pat Doody,et al.  Mining network relationships in the internet of things , 2012, Self-IoT '12.

[308]  Nabil Sultan,et al.  loud computing for education : A new dawn ? , 2009 .

[309]  Rajkumar Buyya,et al.  Big Data computing and clouds: Trends and future directions , 2013, J. Parallel Distributed Comput..

[310]  Marimuthu Palaniswami,et al.  Internet of Things (IoT): A vision, architectural elements, and future directions , 2012, Future Gener. Comput. Syst..

[311]  Dengguo Feng,et al.  Study on Cloud Computing Security: Study on Cloud Computing Security , 2011 .

[312]  Erik G. Hoel,et al.  Spatial indexing and analytics on Hadoop , 2014, SIGSPATIAL/GIS.

[313]  Kai Liu,et al.  Forming a global monitoring mechanism and a spatiotemporal performance model for geospatial services , 2015, Int. J. Geogr. Inf. Sci..

[314]  Jordi Torres,et al.  GreenHadoop: leveraging green energy in data-processing frameworks , 2012, EuroSys '12.

[315]  Bin Zhou,et al.  Performance improvement techniques for geospatial web services in a cyberinfrastructure environment - A case study with a disaster management portal , 2015, Comput. Environ. Urban Syst..

[316]  Edd Dumbill,et al.  A Revolution That Will Transform How We Live, Work, and Think: An Interview with the Authors of Big Data , 2013, Big Data.

[317]  R. Rust,et al.  IT-Related Service , 2013 .

[318]  Angappa Gunasekaran,et al.  Education and training for successful career in Big Data and Business Analytics , 2015 .