I/O Parallelization for the Goddard Earth Observing System Data Assimilation System (GEOS DAS)
The National Aeronautics and Space Administration (NASA) Data Assimilation Office (DAO) at the Goddard Space Flight Center (GSFC) has developed the GEOS DAS, a data assimilation system that provides production support for NASA missions and will support NASA's Earth Observing System (EOS) in the coming years. The DAO's support of the EOS project, along with the requirement of producing long-term reanalysis datasets with an unvarying system, levies a large I/O burden on the future system. The DAO has been involved in prototyping parallel implementations of the GEOS DAS for a number of years and is now converting the production version from shared-memory parallelism to distributed-memory parallelism using the portable Message-Passing Interface (MPI). If the MPI-based GEOS DAS is to meet these production requirements, we must make I/O from the parallel system efficient. We have designed a scheme that allows efficient I/O processing while retaining portability, reducing the need for post-processing, and producing the data formats required by our users, both internal and external. The first phase of the GEOS DAS Parallel I/O System (GPIOS) will expand upon the common method of gathering global data to a single processing element (PE) for output. Instead of using a PE that is also tasked with primary computation, a number of PEs will be dedicated to I/O and its related tasks. This allows the data transformations and formatting required prior to output to take place asynchronously with respect to the GEOS DAS assimilation cycle, improving performance and generating output datasets in a format convenient for our users. I/O PEs can be added as needed to handle larger data volumes or to meet user file specifications. We will show I/O performance results from a prototype MPI GCM integrated with GPIOS. Phase two of GPIOS development will examine ways of integrating new software technologies to further improve performance and build scalability into the system.
The maturing of MPI-IO implementations and other supporting libraries such as parallel HDF should provide performance gains while retaining portability.
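The first-phase idea of dedicating some PEs to I/O can be illustrated with a minimal sketch of the rank bookkeeping involved. This is not the GPIOS implementation: the names `PELayout`, `split_pes`, `role_color`, and `io_pe_for_field` are hypothetical, and both the choice of the highest ranks as I/O PEs and the round-robin assignment of output fields are assumptions made only for illustration. The sketch is plain Python with no MPI calls; in an MPI program the role value could serve as the color argument to `MPI_Comm_split`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PELayout:
    """Hypothetical description of which ranks compute and which do I/O."""
    compute_ranks: tuple
    io_ranks: tuple


def split_pes(world_size, n_io):
    """Reserve the last n_io ranks as dedicated I/O PEs (an assumed convention)."""
    if not 0 < n_io < world_size:
        raise ValueError("need at least one compute PE and one I/O PE")
    first_io = world_size - n_io
    return PELayout(
        compute_ranks=tuple(range(first_io)),
        io_ranks=tuple(range(first_io, world_size)),
    )


def role_color(rank, layout):
    """Return 0 for a compute PE and 1 for an I/O PE.

    In an MPI code this value could be used as the color when splitting
    the world communicator into compute and I/O groups.
    """
    return 1 if rank in layout.io_ranks else 0


def io_pe_for_field(field_index, layout):
    """Pick the I/O PE that gathers and writes a given output field.

    Round-robin assignment is an assumption; it simply spreads fields
    evenly so that adding I/O PEs reduces the load on each one.
    """
    return layout.io_ranks[field_index % len(layout.io_ranks)]


if __name__ == "__main__":
    layout = split_pes(world_size=8, n_io=2)
    for field in range(4):
        print(f"field {field} -> I/O PE {io_pe_for_field(field, layout)}")
```

Under these assumptions, raising `n_io` shortens the list of fields each I/O PE must handle, which mirrors the statement that I/O PEs can be added as needed to handle larger data volumes.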