Web Proxy Auto Discovery for the WLCG
暂无分享,去创建一个
All four of the LHC experiments depend on web proxies (that is, squids) at each grid site to support software distribution by the CernVM FileSystem (CVMFS). CMS and ATLAS also use web proxies for conditions data distributed through the Frontier Distributed Database caching system. ATLAS & CMS each have their own methods for their grid jobs to find out which web proxies to use for Frontier at each site, and CVMFS has a third method. Those diverse methods limit usability and flexibility, particularly for opportunistic use cases, where an experiment’s jobs are run at sites that do not primarily support that experiment. This paper describes a new Worldwide LHC Computing Grid (WLCG) system for discovering the addresses of web proxies. The system is based on an internet standard called Web Proxy Auto Discovery (WPAD). WPAD is in turn based on another standard called Proxy Auto Configuration (PAC). Both the Frontier and CVMFS clients support this standard. The input into the WLCG system comes from squids registered in the ATLAS Grid Information System (AGIS) and CMS SITECONF files, cross-checked with squids registered by sites in the Grid Configuration Database (GOCDB) and the OSG Information Management (OIM) system, and combined with some exceptions manually configured by people from ATLAS and CMS who operate WLCG Squid monitoring. WPAD servers at CERN respond to http requests from grid nodes all over the world with a PAC file that lists available web proxies, based on IP addresses matched from a database that contains the IP address ranges registered to organizations. Large grid sites are encouraged to supply their own WPAD web servers for more flexibility, to avoid being affected by short term long distance network outages, and to offload the WLCG WPAD servers at CERN. The CERN WPAD servers additionally support requests from jobs running at non-grid sites (particularly for LHC@Home) which they direct to the nearest publicly accessible web proxy servers. The responses to those requests are geographically ordered based on a separate database that maps IP addresses to longitude and latitude.
[1] Predrag Buncic,et al. Status and future perspectives of CernVM-FS , 2012 .
[2] Dave Dykstra,et al. Greatly improved cache update times for conditions data with Frontier/Squid , 2009 .
[3] Randall J. Sobie,et al. Dynamic web cache publishing for IaaS clouds using Shoal , 2013, ArXiv.
[4] Dave Dykstra. Scaling HEP to Web Size with RESTful Protocols: The Frontier Example , 2011 .