Byte Structure Variable Length Coding (BS-VLC): A New Specific Algorithm Applied in the Compression of Trajectories Generated by Molecular Dynamics

Molecular dynamics is a well-known technique very much used in the study of biomolecular systems. The trajectory files produced by molecular dynamics simulations are extensive, and the classical lossless algorithms give poor efficiencies in their compression. In this work, a new specific algorithm, named byte structure variable length coding (BS-VLC), is introduced. Trajectory files, obtained by molecular dynamics applied to trypsin and a trypsin:pancreatic trypsin inhibitor complex, were compressed using four classical lossless algorithms (Huffman, adaptive Huffman, LZW, and LZ77) as well as the BS-VLC algorithm. The results obtained show that BS-VLC nearly triplicates the compression efficiency of the best classical lossless algorithm, preserving a near lossless behavior. Compression efficiencies close to 50% can be obtained with a high degree of precision, and the maximum efficiency possible (75%), within this algorithm, can be performed with good precision.