An Autonomous Data Structure for Brute Force Calculations in the Cloud

Commercial cloud systems allow massively parallel execution of a computing task at low cost. We want to exploit this economic opportunity by solving classical problems in Operations Research through complete enumeration, especially problems that can be expressed as integer programming problems. We propose and evaluate a data structure, Scalable Virtual Distributed Hashing, that autonomously extends the computing task over as many nodes as are needed in order to return a result within a time limit set by the user. The data structure copes with differing and changing node capacities and with the effects of node failures. It is modeled after Scalable Distributed Data Structures and Extendible Hashing in particular.
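As a rough illustration of the idea, and not the scheme evaluated in the paper, the sketch below splits a complete enumeration over a power-of-two number of nodes in the manner of Extendible Hashing: candidate i goes to node i mod 2^level, and the level grows until each node's share fits the time limit. The function names, the knapsack instance, and the assumption of identical node throughput (`rate`, in candidates per second) are all hypothetical; capacity variation and node failures are ignored.

```python
def split_level(total, rate, time_limit):
    """Smallest level j such that 2**j equal-capacity nodes finish in time.

    Assumption: every node evaluates `rate` candidates per second.
    """
    level = 0
    while total / (rate * 2 ** level) > time_limit:
        level += 1  # doubling the node count halves each node's share
    return level


def node_share(node_id, level, total):
    """Candidates assigned to one node: those with i mod 2**level == node_id."""
    return range(node_id, total, 2 ** level)


def best_knapsack(candidates, weights, values, capacity):
    """Brute-force 0/1 knapsack over the given bitmask candidates."""
    best = (0, 0)  # (value, bitmask); the empty selection is always feasible
    for mask in candidates:
        chosen = [k for k in range(len(weights)) if (mask >> k) & 1]
        if sum(weights[k] for k in chosen) <= capacity:
            best = max(best, (sum(values[k] for k in chosen), mask))
    return best


# Hypothetical toy instance: 5 items, so 2**5 = 32 candidate selections.
weights = [3, 4, 5, 9, 4]
values = [3, 4, 4, 10, 4]
capacity = 11
total = 2 ** len(weights)

# With 5 candidates/second per node and a 2-second limit, 4 nodes are needed.
level = split_level(total, rate=5, time_limit=2)

# Each "node" enumerates only its own share; here they run sequentially.
partial = [
    best_knapsack(node_share(n, level, total), weights, values, capacity)
    for n in range(2 ** level)
]
result = max(partial)  # combine the per-node optima
```

In a real deployment each share would run on its own cloud node, and a node that found its share too large for its actual capacity would split again on its own initiative; the sketch fixes the level up front only to keep the example short.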
