Constrained de Bruijn Codes: Properties, Enumeration, Constructions, and Applications

The de Bruijn graph, its sequences, and their various generalizations have found many applications in information theory, including many new ones in the last decade. In this paper, motivated by a coding problem for emerging memory technologies, a set of sequences that generalize the sequences in the de Bruijn graph is defined. These sequences can also be defined and viewed as constrained sequences. Hence, they will be called constrained de Bruijn sequences, and a set of such sequences will be called a constrained de Bruijn code. Several properties and alternative definitions of such codes are examined, and the codes are analyzed both as generalized sequences in the de Bruijn graph (and its generalization) and as constrained sequences. Various enumeration techniques are used to compute the total number of sequences for any given set of parameters. A method for constructing such codes, based on the theory of shift-register sequences, is proposed. Finally, we show how these constrained de Bruijn sequences and codes can be applied in constructions of codes for correcting synchronization errors in the $\ell$-symbol read channel and in the racetrack memory channel. For this purpose, these codes are larger in size than previously known codes.
