Build less code deliver more science: an experience report on composing scientific environments using component-based and commodity software platforms

Modern scientific software is daunting in its diversity and complexity. From massively parallel simulations running on the world's largest supercomputers to visualizations and user support environments that manage ever-growing, complex data collections, the challenges for software engineers are plentiful. While high-performance simulators are necessarily specialized codes that maximize performance on specific supercomputer architectures, we argue that the vast majority of supporting infrastructure, data management, and analysis tools can leverage commodity open-source and component-based technologies. This approach can significantly reduce the effort and cost of building complex, collaborative scientific user environments, and can increase their reliability and extensibility. In this paper, we describe our experiences in creating an initial user environment for scientists who model the detailed effects of climate change on the environment of selected geographical regions. Our approach composes the user environment from the Velo scientific knowledge management platform and the MeDICi Integration Framework for scientific workflows. These established platforms leverage component-based technologies and extend commodity open-source platforms with abstractions and capabilities that make them amenable to broad use in science. Using this approach, we were able to deliver an operational user environment capable of running thousands of simulations in a 7-month period and achieve significant software reuse.
