Data management and simulation support accelerating carbon capture through computing

The Carbon Capture Simulation Initiative (CCSI) project has developed and deployed a scientific infrastructure called the CCSI Toolset. The CCSI Toolset provides state-of-the-art computational modeling and simulation tools to accelerate the commercialization of carbon capture technologies, from discovery through development and demonstration to widespread deployment at hundreds of power plants. Carbon capture technologies have the potential to dramatically reduce carbon emissions from power plants. The CCSI Toolset provides industry end users with a comprehensive, integrated suite of leading-edge, scientifically validated models, together with support for simulation, uncertainty quantification, optimization, risk analysis, and decision making. At its core is an integration framework that executes simulations and workflows, including optimization and uncertainty parameter sweeps, on a wide variety of computing platforms: desktops, clusters, clouds, and HPC systems. The framework runs a variety of commercial process simulation packages as well as custom simulators, and it lets scientists run and manage thousands of concurrent simulations for optimization and uncertainty quantification. Components of the CCSI Toolset are connected through a data management system that stores data in a repository and tracks the provenance of each simulation and its associated components. The data management system tracks all the configurations, models, simulations, and results created during the design of a carbon capture system, and it supports the design life cycle as well as decision making. The primary contribution of this paper is thus the design and implementation of the integration framework within the CCSI Toolset, which provides both data management and simulation support for CCSI.
This integration framework has been deployed and is in use by several groups of researchers and commercial entities.
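
The pattern described above, a parameter sweep executed concurrently with a provenance record kept for every run, can be illustrated with a minimal Python sketch. This is not the CCSI Toolset's API: `toy_simulator`, `config_id`, `run_one`, `sweep`, the parameter names, and the in-memory dictionary standing in for a repository are all hypothetical and chosen only for illustration.

```python
import hashlib
import itertools
import json
from concurrent.futures import ThreadPoolExecutor


def toy_simulator(params):
    """Hypothetical stand-in for a process simulator (not a CCSI model)."""
    # Arbitrary closed-form response so the example runs without external tools.
    capture = params["solvent_flow"] * 0.1 / (1.0 + params["gas_flow"] * 0.01)
    return {"capture_fraction": min(capture, 1.0)}


def config_id(simulator_name, params):
    """Deterministic ID for a run: hash of the simulator name and its inputs."""
    payload = json.dumps({"simulator": simulator_name, "params": params},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def run_one(params):
    """Run one simulation and return a provenance record for it."""
    result = toy_simulator(params)
    return {
        "id": config_id("toy_simulator", params),
        "simulator": "toy_simulator",
        "params": params,
        "result": result,
    }


def sweep(grid, max_workers=4):
    """Run every combination in `grid` concurrently; return records keyed by run ID."""
    names = sorted(grid)
    configs = [dict(zip(names, values))
               for values in itertools.product(*(grid[n] for n in names))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        records = list(pool.map(run_one, configs))
    # In-memory dict standing in for a provenance repository (an assumption).
    return {rec["id"]: rec for rec in records}


if __name__ == "__main__":
    repository = sweep({"solvent_flow": [2.0, 4.0, 6.0], "gas_flow": [50.0, 100.0]})
    best = max(repository.values(), key=lambda r: r["result"]["capture_fraction"])
    print(len(repository), "runs; best inputs:", best["params"])
```

Keying each record by a hash of the simulator name and its inputs means that rerunning an identical configuration maps to the same entry, which is one simple way to make results traceable to the exact inputs that produced them. A production system would presumably also record software versions, timestamps, and links to the model files involved.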
