Dataflow Management: A Grand Challenge in Multiscale Materials Modelling

Workflows for multiscale simulation consist of individual steps that exchange model data via so-called dataflows. The interoperability of these steps is strongly limited by the inherent heterogeneity of the data and by the mutually incompatible input and output formats of the simulation codes. We address this data interoperability challenge with an approach based on a data model and a layered architecture that separates the application realm from the storage formats. This abstraction enables seamless embedding of the individual simulation codes and thus simplifies the task of workflow designers. We report on our progress in implementing this concept using the Chemical Markup Language standard and an extension of the OpenMolGRID library. By employing the solution in a proof-of-principle simulation of organic light-emitting diodes, we demonstrate its relevance for multiscale materials modelling in industrial deployment and suggest further exploitation measures.
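
The separation of a data model from storage formats can be illustrated with a minimal Python sketch. It is not the implementation reported here and does not use the OpenMolGRID API. All names in it (`Atom`, `Molecule`, `FormatAdapter`, `XyzAdapter`, `CmlAdapter`, `convert`) are hypothetical, and the XML handled by `CmlAdapter` is a simplified, namespace-free subset modelled on the Chemical Markup Language `molecule`/`atomArray`/`atom` elements rather than schema-valid CML.

```python
from dataclasses import dataclass
from typing import List, Protocol
import xml.etree.ElementTree as ET


@dataclass
class Atom:
    """Application-level atom: element symbol and Cartesian coordinates."""
    element: str
    x: float
    y: float
    z: float


@dataclass
class Molecule:
    """Application-level data model shared by all workflow steps (hypothetical)."""
    atoms: List[Atom]


class FormatAdapter(Protocol):
    """Storage-format layer: each file format implements read and write."""
    def read(self, text: str) -> Molecule: ...
    def write(self, molecule: Molecule) -> str: ...


class XyzAdapter:
    """Plain XYZ text: atom count, comment line, then one atom per line."""
    def read(self, text: str) -> Molecule:
        lines = text.strip().splitlines()
        count = int(lines[0])
        atoms = []
        for line in lines[2:2 + count]:
            element, x, y, z = line.split()
            atoms.append(Atom(element, float(x), float(y), float(z)))
        return Molecule(atoms)

    def write(self, molecule: Molecule) -> str:
        rows = [f"{a.element} {a.x:.6f} {a.y:.6f} {a.z:.6f}" for a in molecule.atoms]
        return "\n".join([str(len(rows)), "generated"] + rows) + "\n"


class CmlAdapter:
    """Simplified CML-like XML (assumption: no namespace, 3D coordinates only)."""
    def read(self, text: str) -> Molecule:
        root = ET.fromstring(text)
        atoms = [
            Atom(n.get("elementType"), float(n.get("x3")),
                 float(n.get("y3")), float(n.get("z3")))
            for n in root.iter("atom")
        ]
        return Molecule(atoms)

    def write(self, molecule: Molecule) -> str:
        root = ET.Element("molecule")
        array = ET.SubElement(root, "atomArray")
        for i, a in enumerate(molecule.atoms, start=1):
            ET.SubElement(array, "atom", id=f"a{i}", elementType=a.element,
                          x3=repr(a.x), y3=repr(a.y), z3=repr(a.z))
        return ET.tostring(root, encoding="unicode")


def convert(text: str, source: FormatAdapter, target: FormatAdapter) -> str:
    """Route every conversion through the data model instead of pairwise converters."""
    return target.write(source.read(text))
```

In this sketch a workflow step only ever sees `Molecule` objects, so supporting a further simulation code would mean adding one adapter rather than one converter per pair of formats.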
