Viewing Compression Ratio Statistics

Hyperstage provides statistics on table and column compression. The compression ratio is calculated relative to the natural size of the uncompressed data in the table or column. A ratio of n means that the compressed data, including the statistics and technical description of a column, is n times smaller than its theoretical natural size.
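The definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample byte counts are hypothetical and are not part of Hyperstage.

```python
def compression_ratio(natural_size_bytes: int, compressed_size_bytes: int) -> float:
    """Return how many times smaller the compressed data is than its natural size."""
    if compressed_size_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return natural_size_bytes / compressed_size_bytes


# Hypothetical column: 1,000,000 bytes of natural size stored in 125,000 bytes.
ratio = compression_ratio(1_000_000, 125_000)
print(ratio)  # 8.0, that is, the compressed data is 8 times smaller
```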

The following natural sizes (in bytes) are defined for various data types. The natural size of a data type is approximately equal to its size in the binary import/export format.

Data Types and Natural Sizes

Data Type                                         Natural Size (in bytes)

CHAR(n), BINARY(n)                                n*(number of rows)
BIGINT, INT, MEDIUMINT, SMALLINT, TINYINT, BOOL   (8, 4, 3, 2, 1, or 1, respectively)*(number of rows)
YEAR                                              4*(number of rows)
DATE                                              10*(number of rows)
TIME                                              8*(number of rows)
TIMESTAMP/DATETIME                                19*(number of rows)
DEC(x,y)                                          (x+1)*(number of rows)
FLOAT                                             4*(number of rows)
REAL, DOUBLE                                      8*(number of rows)
VARCHAR(n), VARBINARY(n)                          Total number of bytes used: the total length of all strings (excluding terminating characters) + 2*(number of rows)
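The natural sizes listed above can be computed with a small helper. The following Python sketch is illustrative only: the function names and the lookup dictionary are hypothetical, and the handling of NULL values is not covered because this section does not describe it.

```python
# Bytes per row for fixed-width types, taken from the table above.
BYTES_PER_ROW = {
    "BIGINT": 8, "INT": 4, "MEDIUMINT": 3, "SMALLINT": 2, "TINYINT": 1, "BOOL": 1,
    "YEAR": 4, "DATE": 10, "TIME": 8, "TIMESTAMP": 19, "DATETIME": 19,
    "FLOAT": 4, "REAL": 8, "DOUBLE": 8,
}


def natural_size_fixed(data_type: str, num_rows: int, length: int = 0) -> int:
    """Natural size in bytes of a fixed-width column.

    For CHAR(n) and BINARY(n), pass n as `length`.
    For DEC(x,y), pass the precision x as `length`.
    """
    name = data_type.upper()
    if name in ("CHAR", "BINARY"):
        return length * num_rows
    if name == "DEC":
        return (length + 1) * num_rows
    return BYTES_PER_ROW[name] * num_rows


def natural_size_variable(values: list[bytes]) -> int:
    """Natural size in bytes of a VARCHAR(n) or VARBINARY(n) column.

    Sums the length of every value (no terminating characters)
    and adds 2 bytes per row.
    """
    return sum(len(v) for v in values) + 2 * len(values)


print(natural_size_fixed("DATE", 1000))                 # 10 bytes * 1000 rows
print(natural_size_fixed("DEC", 10, length=9))          # (9 + 1) bytes * 10 rows
print(natural_size_variable([b"abc", b"", b"hello"]))   # 8 bytes of data + 2 * 3 rows
```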



Comparison of Calculated Compression Ratio to Physical Size

The compression ratio calculated above differs from the compression ratio calculated from the physical sizes of the files on disk. The ratio based on physical size is slightly smaller, because extra files containing statistics on the imported data, such as Knowledge Nodes, are also generated. Knowledge Nodes are used to optimize query execution and are discussed further in About the Knowledge Grid.
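To see why the physical ratio comes out smaller, consider the following Python sketch. All byte counts are invented for illustration, and the simple model used here (physical size = compressed data + extra statistics files) is an assumption, not a description of the actual on-disk layout.

```python
def ratio(natural_size_bytes: int, stored_size_bytes: int) -> float:
    """Natural size divided by the size actually stored."""
    return natural_size_bytes / stored_size_bytes


# Hypothetical figures for one table.
natural_size = 1_000_000      # natural size of the uncompressed data
compressed_size = 125_000     # compressed data, as used by the calculated ratio
extra_files_size = 5_000      # assumed size of extra files such as Knowledge Nodes

calculated_ratio = ratio(natural_size, compressed_size)
physical_ratio = ratio(natural_size, compressed_size + extra_files_size)

print(f"calculated: {calculated_ratio:.2f}")  # calculated: 8.00
print(f"physical:   {physical_ratio:.2f}")    # physical:   7.69
```

Because the extra files only add to the denominator, the physical ratio in this model can never exceed the calculated one.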

