Mojette Erasure Code for Distributed Storage (Code à Effacement Mojette pour le Stockage Distribué)

Erasure codes generate redundancy for digital data in a distributed storage system. This redundancy makes it possible to restore a missing part of the data after a failure. Compared with classical replication techniques, erasure codes considerably reduce the amount of redundancy generated. However, this reduction comes with significant computational complexity, which penalizes encoding and decoding performance and limits the use of these codes to cold data. In this thesis, we investigate the use of the Mojette transform to provide a high-performance erasure code suited to hot data. The resulting code, however, requires more redundancy than classical codes. The first contribution of this thesis is the design of a systematic version of the Mojette erasure code. This version significantly improves the performance of the code while reducing the amount of redundancy required. The second contribution is the integration of this solution into the RozoFS distributed file system. This integration allows the system to provide continuous service in the event of a failure while handling hot data with half as much data as systems based on replication. A third line of research focuses on the design of a distributed method for generating new Mojette codeword symbols. This technique helps restore a redundancy threshold in the storage system.
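
The redundancy comparison between replication and erasure coding can be illustrated with simple arithmetic. The following sketch is not taken from the thesis. It assumes an ideal maximum distance separable (MDS) code in which any k of the n stored blocks suffice to rebuild the data, and the parameters k = 4, n = 6 are chosen only for illustration. The function names are hypothetical. Because the Mojette code needs more redundancy than classical codes, its actual overhead may sit somewhat above the ideal figure computed here.

```python
from fractions import Fraction


def replication_overhead(failures_tolerated: int) -> Fraction:
    """Stored bytes per user byte when keeping full copies.

    Surviving f failures requires f + 1 complete copies of the data.
    """
    if failures_tolerated < 0:
        raise ValueError("failures_tolerated must be non-negative")
    return Fraction(failures_tolerated + 1, 1)


def erasure_overhead(k: int, n: int) -> Fraction:
    """Stored bytes per user byte for an ideal (n, k) MDS erasure code.

    The data is split into k blocks and encoded into n blocks of the
    same size, so the stored volume is n / k times the user data.
    """
    if not 0 < k <= n:
        raise ValueError("expected 0 < k <= n")
    return Fraction(n, k)


# Illustrative parameters (an assumption, not a value from the thesis).
k, n = 4, 6
tolerated = n - k  # an ideal MDS code survives the loss of any n - k blocks

rep = replication_overhead(tolerated)
ec = erasure_overhead(k, n)

print(f"failures tolerated: {tolerated}")
print(f"replication stores {rep} bytes per user byte")
print(f"erasure coding stores {ec} bytes per user byte")
print(f"replication / erasure coding ratio: {rep / ec}")
```

Under these assumptions, both schemes tolerate two failures, but three-way replication stores twice as much data as the (6, 4) code, which is consistent with the "half as much data" figure stated in the abstract.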
