Brane: A Framework for Programmable Orchestration of Multi-Site Applications

Running distributed applications on geographically dispersed IT resources often comes with technical and organizational challenges, regardless of the context and rationale. If not addressed appropriately, these challenges may impede development and, in turn, scientific and business innovation. We have developed the Brane framework to support implementers in addressing these challenges. Brane uses containerization to encapsulate functionality as portable building blocks. Application orchestration is programmable: it is expressed in an intuitive domain-specific language. As a result, end users with limited programming experience can compose applications themselves, from user-friendly interactive notebooks, without having to deal with the underlying technical details. In this paper, we introduce Brane, describe its components and features, and validate the framework with an implementation of a real-world scientific use case.
