Series in Informatics

Optimistic Aborts for Geo-distributed Transactions

Network latency can have a significant impact on the performance of transactional storage systems, particularly in wide-area or geo-distributed deployments. To reduce latency, systems typically rely on a cache to service read requests closer to the client. However, caches are not effective for write-heavy workloads, which must be processed by the storage system in order to maintain serializability. This paper presents a new technique, called optimistic abort, which reduces network latency for high-contention, write-heavy workloads by identifying, as early as possible, transactions that are likely to abort, and aborting them before they reach the store. We have implemented optimistic abort in a system called Gotthard, which leverages recent advances in network data plane programmability to execute transaction processing logic directly in network devices. Gotthard examines network traffic to observe and log transaction requests. If Gotthard suspects that a transaction is likely to be aborted at the store, it aborts the transaction early by rewriting the packet header and routing the packets back to the client. Gotthard significantly reduces overall latency and improves throughput for high-contention workloads.

Report Info
Number: USI-INF-TR-2016-05
Institution: Faculty of Informatics, Università della Svizzera italiana, Lugano, Switzerland
Online Access: www.inf.usi.ch/techreports
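The optimistic abort idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not Gotthard's actual data plane program: the `Transaction` format (a set of expected read values plus a set of writes), the `Store` and `Switch` classes, and the rollback policy after a store-side abort are all hypothetical names and choices introduced here for illustration. The switch logs the writes of requests it forwards and aborts any later request whose expected read values disagree with that log.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Transaction:
    # Hypothetical request format: values the client read earlier and
    # expects to still be current, plus the values it wants to write.
    reads: Dict[str, int] = field(default_factory=dict)
    writes: Dict[str, int] = field(default_factory=dict)


class Store:
    """Authoritative store: commits only if every expected read is current."""

    def __init__(self, initial: Dict[str, int]):
        self.data = dict(initial)

    def execute(self, txn: Transaction) -> str:
        for key, seen in txn.reads.items():
            if self.data.get(key) != seen:
                return "ABORT"
        self.data.update(txn.writes)
        return "COMMIT"


class Switch:
    """On-path device that logs requests and aborts likely losers early."""

    def __init__(self, store: Store):
        self.store = store
        self.log: Dict[str, int] = {}  # latest written values seen in traffic
        self.early_aborts = 0

    def handle(self, txn: Transaction) -> str:
        # Early abort: an expected read disagrees with a logged write,
        # so the store would very likely reject this transaction.
        for key, seen in txn.reads.items():
            logged = self.log.get(key)
            if logged is not None and logged != seen:
                self.early_aborts += 1
                return "ABORT"  # returned to the client, store not contacted

        # Optimistically log the writes, then forward to the store.
        self.log.update(txn.writes)
        result = self.store.execute(txn)
        if result == "ABORT":
            # Assumed policy: forget optimistic entries that did not commit.
            for key in txn.writes:
                self.log.pop(key, None)
        return result


store = Store({"x": 0})
switch = Switch(store)

# Client A read x == 0 and writes x = 1: forwarded and committed.
first = switch.handle(Transaction(reads={"x": 0}, writes={"x": 1}))
# Client B still believes x == 0: aborted at the switch.
second = switch.handle(Transaction(reads={"x": 0}, writes={"x": 2}))
print(first, second, switch.early_aborts, store.data)
```

In this sketch the second transaction never reaches the store, which is where the latency saving would come from when the switch sits much closer to the client than the store does. The simulation is synchronous for simplicity; in a real deployment the store's reply would arrive later, and the switch would have to reconcile its log at that point.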
