SciTokens: Capability-Based Secure Access to Remote Scientific Data

The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation, addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely used software that supports distributed scientific computing, including HTCondor, CVMFS, and XRootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure, including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens (1) enables use of distributed computing for scientific domains that require greater data protection and (2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems.
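As an illustration of the capability-based model described above, the following minimal Python sketch shows how a storage service might decide whether a token's authorizations cover a requested operation. It is not the SciTokens library API. The scope syntax (`action:/path`, space-separated) and the function names `parse_scopes` and `is_authorized` are assumptions made for this example, and token signature verification is omitted entirely.

```python
from posixpath import normpath


def parse_scopes(scope_claim):
    """Split a space-separated scope string into (action, path) pairs.

    The "action:/path" syntax is an assumption made for this sketch.
    """
    capabilities = []
    for item in scope_claim.split():
        action, _, path = item.partition(":")
        # A scope with no path is treated as applying to the root.
        capabilities.append((action, normpath(path or "/")))
    return capabilities


def is_authorized(scope_claim, action, path):
    """Return True if some granted scope covers the action on the path."""
    # Normalize first so "../" segments cannot escape a granted prefix.
    requested = normpath(path)
    for granted_action, granted_path in parse_scopes(scope_claim):
        if granted_action != action:
            continue
        prefix = granted_path.rstrip("/")
        # Match the granted path itself or anything beneath it.
        if requested == granted_path or requested.startswith(prefix + "/"):
            return True
    return False


# Hypothetical token scope: read input frames, write results.
scope = "read:/data/frames write:/results"
print(is_authorized(scope, "read", "/data/frames/H1.gwf"))
print(is_authorized(scope, "write", "/data/frames/H1.gwf"))
```

Because the token names only the operations and paths a workflow needs, a token stolen from a remote worker grants nothing beyond those paths, unlike a general-purpose identity credential.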
