In‐place query driven big data platform: Applications to post processing of environmental monitoring

This paper describes an experimental big data platform and its use in two environmental monitoring applications: visualization of global climate forecast data, and air quality model simulation and response. Environmental monitoring generally requires both model simulation for forecasting and data processing for visualization and analysis. To meet this need, an in‐place query driven big data platform was developed, based on the concepts of Query Driven Visualization and the shared‐nothing distributed database. The platform consists of one master data node and 17 slave data nodes, and it is linked to the National Center for High‐performance Computing supercomputer, Advanced Large‐scale Parallel Supercluster, and storage pool. On the software side, the openSUSE operating system and the MariaDB database are installed on all nodes. The master data node handles metadata management and information integration, while the 17 slave data nodes host the distributed database and perform parallel model simulation, data visualization, and analysis. In the global climate data visualization application (Outgoing Longwave Radiation or OLR, temperature, rainfall, etc.), the platform first partitions Network Common Data Form (NetCDF) file data into the shared‐nothing distributed databases, so that each slave data node produces a partial visualization; the partial results are then integrated into a complete visualization on the master node through Message Passing Interface (MPI) communication.
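The partition, query, and integrate flow described above can be illustrated with a minimal single-process sketch. It is not the platform's implementation: the synthetic grid stands in for a NetCDF variable such as OLR, plain Python lists stand in for the per-node MariaDB databases, and an ordinary function call stands in for the MPI gather. The function names, the 73 x 144 grid size, and the threshold of 240.0 are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

NUM_WORKERS = 17  # number of slave data nodes stated in the abstract

Grid = List[List[float]]
Masked = List[List[Optional[float]]]


def partition_rows(n_rows: int, n_parts: int) -> List[Tuple[int, int]]:
    """Split rows 0..n_rows into contiguous, non-overlapping [start, stop) ranges."""
    base, extra = divmod(n_rows, n_parts)
    ranges = []
    start = 0
    for i in range(n_parts):
        # The first `extra` parts take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def worker_query(slab: Grid, threshold: float) -> Masked:
    """Run the query on one node's own slab only (shared-nothing):
    keep cells at or above the threshold and mask the rest with None."""
    return [[v if v >= threshold else None for v in row] for row in slab]


def master_gather(partials: List[Masked]) -> Masked:
    """Concatenate partial results in node order (stand-in for an MPI gather)."""
    merged: Masked = []
    for part in partials:
        merged.extend(part)
    return merged


def make_grid(n_lat: int, n_lon: int) -> Grid:
    """Synthetic field standing in for a gridded variable read from a NetCDF file."""
    return [[float((i * n_lon + j) % 300) for j in range(n_lon)] for i in range(n_lat)]


grid = make_grid(73, 144)                             # hypothetical lat x lon grid
ranges = partition_rows(len(grid), NUM_WORKERS)       # one latitude band per node
slabs = [grid[a:b] for a, b in ranges]                # each node holds only its band
partials = [worker_query(s, 240.0) for s in slabs]    # partial results, per node
image = master_gather(partials)                       # integrated on the master
```

Because each band is queried independently and the bands are contiguous, the merged result matches what a single node would compute over the whole grid, which is the property that allows the partial visualizations to be assembled on the master without further coordination between the slave nodes.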
