Domain Application of High Performance Computing in Earth Science: An Example of Dust Storm Modeling and Visualization

Earth science models often raise computational challenges: they require large amounts of computing resources, and serial computing on a single computer is not sufficient. Further, earth science datasets produced by observations and models are increasingly large and complex, exceeding the limits of most analysis and visualization tools as well as the capacity of a single computer. HPC-enabled modeling, analysis, and visualization solutions are needed to better understand the behaviors, dynamics, and interactions of the complex earth system and its sub-systems. However, there is a wide range of computing paradigms (e.g., Cluster, Grid, GPU, Volunteer, and Cloud Computing) and associated parallel programming standards and libraries (e.g., MPI/OpenMPI, CUDA, and MapReduce). In addition, the choice of specific HPC technologies varies widely with the datasets, computational models, and user requirements. To demystify HPC technologies and lay out the different computing options for scientists, this chapter first presents a generalized HPC architecture for earth science applications, and then demonstrates how that architecture can be instantiated to support the modeling and visualization of dust storms.
