Encoding technique for high data compaction in data bases of fusion devices

At present, data requirements of hundreds of Mbytes/discharge are typical in devices such as JET, TFTR, DIII‐D, etc., and these requirements continue to increase. At these rates, the amount of storage required to maintain discharge information is enormous, and compaction techniques are now essential to reduce it. General compression techniques, however, may distort signals, which is undesirable for fusion diagnostics. We describe here a general technique for data compression that we have developed. The technique, which is based on delta compression, does not require a prior examination of the data, as delayed methods do. Delta values are compacted according to general encoding forms which satisfy the prefix code property and which are defined prior to data capture. Several prefix codes, which are bit oriented and have variable code lengths, have been developed. These encoding methods are independent of the analog characteristics of the signal and enable one to store undistorted signals. The technique has been applied to the databases of the TJ‐I tokamak and the TJ‐IU torsatron. Compaction rates of over 80% were achieved with negligible computational effort. The computer programs were written in ANSI C, thus ensuring portability and easy maintenance. We also present an interpretation, based on information theory, of the high compression rates achieved without signal distortion.
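
As an illustration of the general idea only, the following minimal sketch in C combines delta compression with a bit-oriented, variable-length prefix code that is fixed before any data are captured. The abstract does not specify the codes actually developed, so the choice made here (a zigzag mapping of signed deltas followed by an Elias gamma code) is an assumption, as are all names (BitStream, delta_encode, delta_decode) and the use of long for digitized samples. Because the deltas are stored exactly, the decoded signal is identical to the original.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit stream over a caller-supplied buffer (MSB first). */
typedef struct {
    unsigned char *buf;
    size_t cap;     /* capacity in bytes */
    size_t bitpos;  /* index of the next bit to write or read */
} BitStream;

static int bs_put(BitStream *bs, int bit)
{
    size_t byte = bs->bitpos >> 3;
    if (byte >= bs->cap) return -1;              /* buffer full */
    if (bit) bs->buf[byte] |= (unsigned char)(0x80u >> (bs->bitpos & 7));
    bs->bitpos++;
    return 0;
}

static int bs_get(BitStream *bs)
{
    size_t byte = bs->bitpos >> 3;
    int bit;
    if (byte >= bs->cap) return -1;              /* end of buffer */
    bit = (bs->buf[byte] >> (7 - (int)(bs->bitpos & 7))) & 1;
    bs->bitpos++;
    return bit;
}

/* Map signed deltas to unsigned values: 0,-1,1,-2,2,... -> 0,1,2,3,4,... */
static unsigned long zigzag(long d)
{
    return d >= 0 ? 2UL * (unsigned long)d
                  : 2UL * (unsigned long)(-(d + 1)) + 1UL;
}

static long unzigzag(unsigned long u)
{
    return (u & 1UL) ? -(long)(u >> 1) - 1 : (long)(u >> 1);
}

/* Elias gamma code for v >= 1: (n-1) zero bits, then the n bits of v. */
static int gamma_put(BitStream *bs, unsigned long v)
{
    int n = 0, i;
    unsigned long t;
    assert(v >= 1);
    for (t = v; t != 0; t >>= 1) n++;
    for (i = 0; i < n - 1; i++)
        if (bs_put(bs, 0) < 0) return -1;
    for (i = n - 1; i >= 0; i--)
        if (bs_put(bs, (int)((v >> i) & 1UL)) < 0) return -1;
    return 0;
}

static int gamma_get(BitStream *bs, unsigned long *v)
{
    int zeros = 0, bit, i;
    unsigned long x = 1;
    while ((bit = bs_get(bs)) == 0) zeros++;
    if (bit < 0) return -1;
    for (i = 0; i < zeros; i++) {
        if ((bit = bs_get(bs)) < 0) return -1;
        x = (x << 1) | (unsigned long)bit;
    }
    *v = x;
    return 0;
}

/* Encode n samples as deltas; returns bits written, or 0 if out is too small.
   Assumes small sample ranges (e.g. ADC counts), so deltas cannot overflow. */
size_t delta_encode(const long *samples, size_t n, unsigned char *out, size_t cap)
{
    BitStream bs;
    size_t i;
    long prev = 0;
    memset(out, 0, cap);
    bs.buf = out; bs.cap = cap; bs.bitpos = 0;
    for (i = 0; i < n; i++) {
        if (gamma_put(&bs, zigzag(samples[i] - prev) + 1UL) < 0) return 0;
        prev = samples[i];
    }
    return bs.bitpos;
}

/* Decode n samples; returns 0 on success, -1 on truncated input. */
int delta_decode(unsigned char *in, size_t cap, long *samples, size_t n)
{
    BitStream bs;
    size_t i;
    long prev = 0;
    unsigned long v;
    bs.buf = in; bs.cap = cap; bs.bitpos = 0;
    for (i = 0; i < n; i++) {
        if (gamma_get(&bs, &v) < 0) return -1;
        prev += unzigzag(v - 1UL);
        samples[i] = prev;
    }
    return 0;
}
```

In this sketch, small deltas, which dominate in slowly varying signals, receive the shortest codewords (a zero delta costs a single bit), while large deltas remain representable at the cost of longer codewords. The code table depends on neither the signal nor its statistics, so encoding can proceed sample by sample during acquisition.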