An Efficient Co-processing Framework for Large-Scale Scientific Applications

As scientific applications such as Computational Fluid Dynamics (CFD) simulations generate more and more data, co-processing becomes the most cost-effective way to process the vast amount of data these simulations produce. In a co-processing environment, analysis and/or visualization of intermediate results occurs concurrently with the simulation itself. Compared with post-processing, where analysis and/or visualization are performed after the simulation completes, the potential advantages are improved efficiency and early insight into the simulation process and its results. To enable co-processing, however, intermediate data must be shared between the simulation and the data analysis, and some degree of coordination may be required to maintain the correctness of both. The overhead incurred by data sharing and coordination may well offset the benefits gained, particularly in large-scale distributed systems, where workload sharing, processor affinity, and data locality significantly affect overall performance. In this paper, we propose a co-processing framework to address these issues. Empirical benchmarking results suggest that the co-processing overhead tasks scale well with system size, that the framework yields an overall gain of about 20% in turnaround time compared with post-processing, and that it allows the simulation and data analysis tasks to scale up to their individual limits.