Computational risk management for building highly reliable network services

Building reliable network services that deliver consistently high performance to clients in the presence of failures and bursty demand is expensive and inefficient. Resources often must be heavily overprovisioned to accommodate peak demand, and the cost of such overprovisioning "prices out" many applications that would benefit from a performance safety net and could ultimately provide more reliable service to end users. To address these problems, we propose an approach based on a shared Computational Service Provider (CSP). A CSP is an entity that provides massive amounts of widely distributed computation and storage and makes these resources available through a mix of spot and derivative markets. Services obtain resources through the CSP and, drawing inspiration from finance, employ quantitative risk management techniques to trade off cost, performance, and risk so as to probabilistically achieve target levels of delivered client performance.
