Paper Information - Applying Tunstall Coding in the existing SEED format for Seismographic Data

The Standard for the Exchange of Earthquake Data (SEED) is a commonly used file format for recording seismographic data. The data description language (DDL), part of the SEED specification, describes how the data in a SEED file has been encoded, and every SEED file contains a piece of DDL code that describes how to read its data. In this project, we explore how best to compress seismographic data losslessly under the constraint that the encoding must be describable in DDL. Because DDL is not a Turing-complete language, many standard compression algorithms cannot be implemented in it. However, we show that a modified Tunstall code can be implemented in DDL, producing files that are on average 14% smaller than those produced by the traditionally used Steim2 compression technique. We also compare our results with a standard linear predictive coding scheme (which is not implementable in DDL) and show that our technique is on average 18% worse on the same set of files.
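
For readers unfamiliar with Tunstall coding, the following is a minimal sketch of the textbook (unmodified) construction: a variable-to-fixed code in which variable-length strings of source symbols map to fixed-width codewords. It is not the modified variant described in the abstract and does not involve DDL. The function names `build_tunstall` and `encode`, the symbol probabilities, and the codeword width are all illustrative assumptions.

```python
import heapq
from itertools import count


def build_tunstall(probs, codeword_bits):
    """Build a textbook Tunstall dictionary (hypothetical helper).

    probs: dict mapping each source symbol to its probability (assumed i.i.d.).
    codeword_bits: width of every output codeword.
    Returns a dict mapping source words (tuples of symbols) to codeword indices.
    """
    n = len(probs)
    max_words = 2 ** codeword_bits
    if n > max_words:
        raise ValueError("codeword width too small for the alphabet")
    tie = count()  # tie-breaker so equal probabilities never compare words
    # Max-heap of leaves, implemented with negated probabilities.
    heap = [(-p, next(tie), (s,)) for s, p in probs.items()]
    heapq.heapify(heap)
    # Expanding a leaf replaces it with n children: a net gain of n - 1 words.
    while len(heap) + n - 1 <= max_words:
        neg_p, _, word = heapq.heappop(heap)  # most probable leaf
        for s, p in probs.items():
            heapq.heappush(heap, (neg_p * p, next(tie), word + (s,)))
    words = sorted(w for _, _, w in heap)
    return {w: i for i, w in enumerate(words)}


def encode(symbols, dictionary):
    """Parse symbols into dictionary words; return (codeword indices, leftover)."""
    out, word = [], ()
    for s in symbols:
        word += (s,)
        if word in dictionary:
            out.append(dictionary[word])
            word = ()
    return out, word  # leftover is an incomplete trailing word, if any


# Illustrative two-symbol source with 2-bit codewords (assumed values).
dictionary = build_tunstall({"a": 0.7, "b": 0.3}, codeword_bits=2)
codes, leftover = encode("aaabab", dictionary)
```

Because every codeword has the same width, decoding reduces to a table lookup per codeword, which is the kind of fixed mapping that is plausible to express in a restricted description language. How the paper's modified code handles trailing incomplete words or the specifics of seismographic samples is not covered by this sketch.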

Edwin S. Hong | Shu-Fang Newman