Interval-based approach to lexicographic representation and compression of numeric data
This paper proposes a new method of encoding numbers as variable-length byte-strings. The primary property of the encoding is that lexicographic comparison of the encoded numbers agrees with the order of the real numbers they represent. The encoding is space-efficient. Further, unlike fixed-length representations of numbers (fixed-point, floating-point, etc.), the encoded numbers are not limited in their magnitude or in the number of their significant digits. The paper also elaborates the application of the encoding method to the storage of numeric data in databases. The proposed application for databases is a uniform format for all numbers, regardless of their types and attributes (fields). All numbers are represented in the form of lexicographically comparable byte-strings. This form simplifies the data management software (only one format to deal with at the physical database level) and the hardware (for example, when associative memory and storage devices are used); makes applications more flexible (by removing limitations on the sizes of numbers); and is space-efficient for all numbers while being especially concise for those numbers that are used more frequently in databases.
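The order-preserving property described above can be illustrated with a small sketch. This is not the paper's interval-based encoding, which the abstract does not specify; it is a simpler, commonly used length-prefix scheme for integers, shown only to make concrete what "lexicographic comparison agrees with numeric order" means. The function name `encode_int` and the 127-byte magnitude cap are assumptions of this sketch (the paper's method is stated to have no such limit).

```python
def encode_int(n: int) -> bytes:
    """Encode an integer so that byte-wise comparison matches numeric order.

    Illustrative length-prefix scheme (an assumption of this sketch,
    not the paper's interval-based method):
      - non-negative n: prefix byte 0x80 + length, then big-endian magnitude
      - negative n:     prefix byte 0x7F - length, then complemented magnitude
    """
    magnitude = abs(n)
    length = (magnitude.bit_length() + 7) // 8  # 0 bytes when n == 0
    if length > 127:
        raise ValueError("this sketch supports magnitudes of at most 127 bytes")
    body = magnitude.to_bytes(length, "big")
    if n >= 0:
        # Longer magnitudes get larger prefixes, so they sort later.
        return bytes([0x80 + length]) + body
    # For negatives, invert both the prefix and the body so that
    # larger magnitudes sort earlier.
    return bytes([0x7F - length]) + bytes(0xFF - b for b in body)


if __name__ == "__main__":
    values = [256, -1, 0, -256, 255, 1, -255, 70000, -70000, -2]
    # Sorting by the encoded byte-strings yields the numeric order.
    for v in sorted(values, key=encode_int):
        print(v, encode_int(v).hex())
```

Because Python compares `bytes` objects lexicographically, sorting by `encode_int` is equivalent to sorting the integers themselves. Small magnitudes take only two bytes here, which loosely mirrors the abstract's goal of being concise for frequently used numbers.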