LzxPack

Utility

Submitted by Dan Dooré on Monday, July 23, 2018 - 23:13.

Download

lzxpack01.zip

Release Year

2017

Copyrights

Public Domain

Author

Slavomír Lábsky

Description

Packing utility that runs on Win32 platforms and targets 8-bit micros.

From https://busy.speccy.cz/tvorba/pcprogs.htm

Used in Street Fighter 1 and the Music Experiment demo.

Instructions

################
## LZX Packer ##
################

Version: 01 (02.02.2017)
License: General Public License

LzxPackSuite is a compression system that lets more powerful computers
compress files of up to 64kB specifically for a target 8-bit computer.

It includes the following parts:

- Packers: LzxPack, LzmPack, LzePack
- Depacker / lister: LzxList
- Set of decompression routines for decompressing files on the target computer

Typical use:

Compress a file with LzxPack, then transfer the compressed file to the target computer
and decompress it there using a decompression routine.

The compression system is based on LZ compression. It achieves compression by replacing repeated
occurrences of data with references to a single copy of that data existing earlier
in the uncompressed data stream.

Example illustrating the principle of LZ compression:

Let the original file look like this:

abc12345678def12345678ghi

The compressed file will look like this:

abc12345678def<repetitive sequence length=8 offset=11>ghi

After copying the letter 'f', the decompressor copies 8 bytes from a distance of 11 bytes back (the offset).
If the information about the length and offset of the sequence needs (for example) 2 bytes, the final compressed
length is reduced by 6 bytes, since the original 8-byte sequence was replaced by 2 bytes of information.
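
The copy step in this example can be sketched in a few lines of Python. This is only an
illustration of the LZ principle, not the LzxPack file format: the token list and the name
lz_expand are hypothetical, and the offset is computed from the positions of the two
occurrences rather than read from a real compressed file.

```python
def lz_expand(tokens):
    """Expand a list of tokens: bytes = literal data, (length, offset) = copy."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, bytes):
            out += token                      # uncompressed data, copied as-is
        else:
            length, offset = token
            for _ in range(length):           # byte-by-byte copy from 'offset' bytes back
                out.append(out[-offset])
    return bytes(out)

original = b"abc12345678def12345678ghi"
first = original.find(b"12345678")
second = original.find(b"12345678", first + 1)

# The repeated sequence is replaced by (length, distance back to the first copy).
tokens = [b"abc12345678def", (8, second - first), b"ghi"]
assert lz_expand(tokens) == original
```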

The system supports several different ways of encoding the length and offset information
and can always choose the most effective one for compression.

%%%%%%%%%%%%%
%% LzxPack %%
%%%%%%%%%%%%%

Universal LZX packer for files up to 64kB

- Tries a set of predefined compressions and selects the best one
- When selecting the best compression, the length of the decompression routine can be taken into account
- Allows the user to select a subset of the compressions to use
- Can generate statistical information about the tested compressions for comparison

Use: LzxPack filename <options>

Options:
-s ......... show statistics of all used compressions (default: do not show)
-a ......... save all compressed files (default: save only the best result)
-d<file> ... text file with depacker lengths to include in the statistics
-tXYoAoB ... select required compression (default: try all compressions)

Option -s

Shows statistics of all tested compression types in a summary table.
Meaning of the table columns:

   Compression ..... type of compression
   NumSek .......... number of sequences copied during decompression
   Packed .......... number of bytes in these copied sequences
   NoPck ........... number of uncompressed bytes that are not in any sequence
   Overhead ........ number of bytes needed to store sequence information (lengths and offsets)
   Packed length ... total length of the output compressed file
   With depacker ... sum of the file length and the decompression routine length

Option -a

By default, LzxPack tests all selected compressions and then uses the best one.
With the -a option, every compression is actually applied and multiple output files
are saved: one output file for each compression. The set of compressions to use
can be specified with option -t (see below).

Option -d<file>

Determines the best compression with the length of the decompression routine included.
To know the length of the decompressor, LzxPack needs to read a special file containing
definitions of these lengths. An example of this file is "spd0lens".

Example usage of this file: LzxPack -dspd0lens <file_to_pack>

If you write new decompression routines for another platform, you can put the lengths
of these routines into a new text file, pass this file with the -d option, and LzxPack will be able
to select the best compression for use on that platform.

Option -tXYoAoB

Selecting required compression or set of compressions:

   X ... compression type: 0=any 1=LZM 2=LZE 3=ZX7 4=BLK 5=BS1 (see below)
   Y ... coding offset: 0=any 1=LZM 2=LZE 4=OF4 5=OF1 6=OF2 7=OFD (see below)
   A ... (optional parameter) number of bits of offset for OF4 OF1 OF2
   B ... (optional parameter) number of bits of second offset for OF2 only

Parameters A and B are optional; if they are not present (only -tXY) or are zero, any possible value is tried.

Types of compression -tXY

Select compression X

     1 ... LZM ... Very simple but very quick compression with sequences up to 127 bytes
     2 ... LZE ... More complex but still quick compression with sequences up to 16383 bytes
     3 ... ZX7 ... Compression similar to the program known as ZX7 (author Einar Saukas)
     4 ... BLK ... Simple block compression without any limitation of sequence length
     5 ... BS1 ... Compression optimized to short sequences and long blocks of data

Offset codings Y

     1 ... LZM ... Very simple but very quick storing of offset 1..255 into one byte
     2 ... LZE ... More complex but still quick storing of offset 1..32767 into max 2 bytes
     4 ... OF4 ... Four offsets of different bit width (A = bit width of the shortest offset)
     5 ... OF1 ... One offset of fixed bit width (A = bit width of offset)
     6 ... OF2 ... Two offsets of fixed bit width (A = width of the 1st, B = width of the 2nd offset)
     7 ... OFD ... Variable bit width offset without any limitations (no need to enter A and B)

Ranges for OF1, OF2, OF4 offsets are given by their bit width.

The output compressed file will have the ".lzx" extension and its name will be extended
with the specification of the compression used, in this form: "-tXYoAoB" (the same as option -t).
If the option -a is used, the filenames will differ in this compression specification.

%%%%%%%%%%%%%%%%%%%%%%%%%
%% LzmPack and LzePack %%
%%%%%%%%%%%%%%%%%%%%%%%%%

Single-purpose LZX packers for the LZM and LZE compressions, optimized for speed.
LzePack can run about 10 times faster than LzxPack, and LzmPack can usually run up to 300 times faster.

Use:

LzmPack <file>
LzePack <file>

There are no options. The only parameter is the file to compress.
The compressed file will have the same name with the extension "lzm" or "lze".

LzmPack uses the LZM compression, which is the same as LzxPack with option -t11, and
LzePack uses the LZE compression, which is the same as LzxPack with option -t22.

%%%%%%%%%%%%%
%% LzxList %%
%%%%%%%%%%%%%

Universal lister and decompressor for files compressed with LzxPack / LzmPack / LzePack

- Automatically detects the type of compression used from the extension or the filename part -tXYoAoB
- Displays the "pack model": the structure of the compressed file and brief statistics
- Allows multiple files to be specified at once for decompression and display of their pack model

Use: LzxList <options> file1 file2 file3...

Options:
-l ........ write listing of pack model of file(s) to stdout (default functionality)
-d ........ depack and write to output file with extension 'out' (or set by the -e)
-u ........ the same as -d but removes the compression info -tXYoAoB from the output filename
-e<ext> ... set extension for output files (default extension is 'out')

You can use wildcards in filenames to list or depack more files together.

Compression type detecting:

   Extension "lzm" ... compression type LZM, the same as -t11
   Extension "lze" ... compression type LZE, the same as -t22
   Other extension ... type of compression is taken from filename part -tXYoAoB

LzxList can be used to decompress files or to display the structure of the compression used.
LzxList also checks the data integrity of the compressed file, in particular that sequences
will not be copied from an area outside the valid data during decompression.

%%%%%%%%%%%%%%
%% DecLzx01 %%
%%%%%%%%%%%%%%

Universal LZX decompression routine for Z80

Supported types of compression X

     3 ... ZX7 ... compression similar to a program known as ZX7 (author Einar Saukas)
     4 ... BLK ... Simple block compression without any limitation of sequence length
     5 ... BS1 ... Compression optimized to short sequences and long blocks of data

Supported offset codings Y

     4 ... OF4 ... Four offsets of different bit width (A = bit width of the shortest offset)
     5 ... OF1 ... One offset of fixed bit width (A = bit width of offset)
     6 ... OF2 ... Two offsets of fixed bit width (A = width of the 1st, B = width of the 2nd offset)
     7 ... OFD ... Offset with variable bit width and without any limitation (no need to enter A and B)

Before compiling this routine, set the parameters com, pos, ofs1 and ofs2
of the compression used according to the part -tXYoAoB of the compressed file name:

    com = X
    pos = Y
   ofs1 = A
   ofs2 = B

The parameters A and B are required only if they are present in the compression specification -tXYoAoB.

The routine can be optimized for code length or for speed
by setting the parameter 'spd' as follows:

    spd = 0 ... optimization for shortest code
    spd = 1 ... compromise between code length and speed
    spd = 2 ... optimization for high speed

The code length of the routine depends on:

- the compression used (-tXY)
- the optimization setting 'spd'

as follows (lengths in bytes):

spd=0    X => 3(ZX7)   4(BLK)   5(BS1)
   Y          ~~~~~~   ~~~~~~   ~~~~~~
   4 (OF4)        79       81      103
   5 (OF1)        65       67       89
   6 (OF2)        80       82      104
   7 (OFD)        61       63       88

spd=1    X => 3(ZX7)   4(BLK)   5(BS1)
   Y          ~~~~~~   ~~~~~~   ~~~~~~
   4 (OF4)        83       85      107
   5 (OF1)        69       71       93
   6 (OF2)        84       86      108
   7 (OFD)        65       67       92

spd=2    X => 3(ZX7)   4(BLK)   5(BS1)
   Y          ~~~~~~   ~~~~~~   ~~~~~~
   4 (OF4)        87       89      114
   5 (OF1)        73       75      100
   6 (OF2)        89       91      116
   7 (OFD)        69       71       99

These lengths are defined in files

   spd0lens ... valid for spd = 0
   spd1lens ... valid for spd = 1
   spd2lens ... valid for spd = 2

These files can be used with the -d option in LzxPack. In this case, LzxPack can select
the best compression that gives the smallest length including the length of decompression routine.
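
The selection described here amounts to minimizing "packed length + depacker length".
A minimal Python sketch of that idea follows. The depacker lengths are the spd=0 values
from the table above; the packed lengths are hypothetical numbers invented for illustration,
and best_with_depacker is not a function of LzxPack.

```python
# Depacker lengths for spd=0, keyed by (X, Y) as in -tXY, taken from the table above.
DEPACKER_LEN_SPD0 = {
    (3, 4): 79, (4, 4): 81, (5, 4): 103,
    (3, 5): 65, (4, 5): 67, (5, 5): 89,
    (3, 6): 80, (4, 6): 82, (5, 6): 104,
    (3, 7): 61, (4, 7): 63, (5, 7): 88,
}

def best_with_depacker(packed_lengths, depacker_lengths):
    """Return ((X, Y), total) with the smallest packed + depacker length."""
    totals = {key: packed_lengths[key] + depacker_lengths[key]
              for key in packed_lengths}
    best = min(totals, key=totals.get)
    return best, totals[best]

# Hypothetical packed lengths (in bytes) for three tested compressions.
packed = {(3, 7): 1000, (5, 4): 970, (4, 5): 1010}
best, total = best_with_depacker(packed, DEPACKER_LEN_SPD0)
print(best, total)
```

With these made-up numbers, -t54 gives the smallest packed file, but -t37 wins once the
depacker length is added.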

Note: this routine does not support the "fast" compressions LZM and LZE (-t11 or -t22).
To decompress LZM / LZE compressed files, please use the optimized DecLzm01 or DecLze01.

Important note: None of the decompression routines (DecLzm, DecLze, DecLzx) handles overlapping data!

Data is always decompressed from lower to higher addresses, so the compressed source data can be loaded
above the destination area. During decompression, compressed data that has already been used can be safely
overwritten by new decompressed data, but a distance must be kept between the end of the compressed area
and the end of the decompressed area. For the compressions LZM, LZE, BLK and BS1, 8 bytes is enough; for the
compression ZX7 the safety distance is 1/8 of the total length of the compressed data (but usually less).
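
As a rough aid for planning memory layout, the rule above can be turned into a small
calculation. This Python sketch assumes that the compressed data must end at least the safety
distance above the end of the decompressed area, and that the ZX7 distance is rounded up to a
whole byte; both are interpretations of the text, and min_load_address is a hypothetical helper.

```python
import math

def min_load_address(dest, unpacked_len, packed_len, compression):
    """Lowest address at which the compressed data can be loaded for
    in-place decompression to 'dest' (assumed interpretation, see above)."""
    if compression.upper() == "ZX7":
        margin = math.ceil(packed_len / 8)   # 1/8 of compressed length, rounded up
    else:                                    # LZM, LZE, BLK, BS1
        margin = 8
    end_of_unpacked = dest + unpacked_len
    return end_of_unpacked + margin - packed_len

# Hypothetical example: 4000 bytes unpacked to 0x8000 from 2400 packed bytes.
print(hex(min_load_address(0x8000, 4000, 2400, "BS1")))
print(hex(min_load_address(0x8000, 4000, 2400, "ZX7")))
```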

%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% DecLzm01 and DecLze01 %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%

Single-purpose LZX decompression Z80 routines optimized for quick LZM and LZE compressions.

DecLzm decompresses files compressed by LzmPack (or LzxPack with option -t11)
DecLze decompresses files compressed by LzePack (or LzxPack with option -t22)

DecLzm is (it seems) the shortest known LZ decompression routine: it has only 26 bytes.
DecLze is not much longer: it has 48 bytes.
In both routines, all data is copied by LDIR, so they are very fast.

%%%%%%%%%%%%%%%%%%%%%%%%%
%% Compression formats %%
%%%%%%%%%%%%%%%%%%%%%%%%%

Common format of compressed file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LZM:        <block> <block> <block> ... <block> <end_mark>
LZE:        <block> <block> <block> ... <block> <end_mark>
BLK:        <block> <block> <block> ... <block> <end_mark>
ZX7: <byte> <block> <block> <block> ... <block> <end_mark>
BS1: <byte> <block> <block> <block> ... <block> <end_mark>

A file compressed by ZX7 or BS1 always begins with one unpacked data byte.

Each <block> contains either uncompressed data or a copying sequence.
Uncompressed data: <id> <data..>
Copying sequence: <id> <offset>

<id> is the identification of the block, and its format depends on the type of compression.
This identification includes the number of bytes of uncompressed data or the length of the sequence.

<offset> tells how many bytes to go back in the already decompressed data to reach the start of the sequence to copy.
Its format depends on the offset coding.

LZM and LZE are "byte-oriented" compressions, which means that <id> and <offset> take whole bytes.
Because of this they allow very fast decompression and very short decompression routines.

BLK, ZX7, BS1 and OF1, OF2, OF4, OFD are "bitstream" compressions and offset codings.
The <id> and <offset> can use only part of a byte, or exactly as many bits as needed.
They are stored in a bitstream, a separate stream of data bytes that is read bit by bit,
always taking as many bits as are currently needed.
Decompression is therefore slower and the decompression routines are longer,
but information is stored more efficiently and the compression ratio is better.

The uncompressed data <data..> is not part of the bitstream; it is stored in the bytes that follow.

Format <id>
~~~~~~~~~~~

LZM (always one byte)

Bit 0 ..... 0 = uncompressed data, 1 = sequence
bit 1-7 ... number of next bytes of uncompressed data or sequence length

Zero length in bits 1-7 means end mark.

LZE (one or two bytes)

Bit 6 ..... 0 = uncompressed data, 1 = sequence
Bit 7 ..... 0 = 6-bit length, 1 = 14-bit length of data or sequence
bit 0-5 ... 6-bit length or high byte of 14-bit length of data or sequence

In case of a 14-bit length there is one additional byte, which is the low byte of the length.
A zero length of uncompressed data means the end mark.

BLK (bitstream)

<Length> 1 ... uncompressed data
<Length> 0 ... copying sequence (length is incremented by 1 before copying)

The length is encoded in Elias-Gamma coding.
A length greater than 65535 means the end mark.

BLK is a very simple compression, and in conjunction with Elias-Gamma offset coding
it allows a very short decompression routine with a good compression ratio.

ZX7 (bitstream)

1 ............ one uncompressed byte
0 <length> ... copying sequence (length is incremented by 1 before copying)

The length is encoded in Elias-Gamma coding.
A length greater than 65535 means the end mark.

A <block> with uncompressed content can hold only one byte, so a separate <block> is needed
for each byte of uncompressed data. This means that every byte of the source file that is not
compressed in a sequence takes 9 bits in the compressed file.

This compression is very similar to the well-known compression program ZX7.
However, since the offset coding in Lzx differs from the offset coding in ZX7,
the decompression routine from ZX7 cannot be used to decompress files compressed by LzxPack.

BS1 (bitstream)

0 ................... one uncompressed byte
10 .................. 2 bytes length sequence
110 ................. 3 bytes length sequence
1110 <length> ....... 4+ bytes length sequence
1111 <length> ....... 8+ bytes of uncompressed data
111100000000000001 .. end mark (13 zero bits = length overflows 65535)

Length coding (slightly modified Elias-Gamma coding)

<n> 1 <n + 2> ... 4+ bytes length sequence
<n> 1 <n + 3> ... 8+ bytes of uncompressed data

Where <n> is the number of zero bits, followed by a 1 bit and then the next n + 2 or n + 3 bits of the value.
13 zero bits (in the case of uncompressed data) mean that the length overflows 65535, so this is the end mark.

This compression is designed for efficient storing of the most frequently occurring
short sequences of 2 and 3 bytes and for storing large blocks of uncompressed data.

Format <offset>
~~~~~~~~~~~~~~~

LZM = one-byte offset

<value> ... the offset value

The offset value is directly the value of this byte.
Offset range: 1..255. Value 0 has no meaning, so it is not valid (LzxList can check it).
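
Putting the LZM <id> and <offset> formats together, a decompressor can be sketched in Python
as below. This is an illustrative reading of the format description, not a port of DecLzm:
it assumes bit 0 is the least significant bit, that the stored length is used as-is, and that
a sequence block is the <id> byte followed directly by the offset byte. The name lzm_unpack is
hypothetical.

```python
def lzm_unpack(packed: bytes) -> bytes:
    """Decode an LZM stream as described above (assumptions in the lead-in)."""
    out = bytearray()
    pos = 0
    while True:
        block_id = packed[pos]
        pos += 1
        length = block_id >> 1               # bits 1-7: length
        if length == 0:                      # zero length = end mark
            break
        if block_id & 1:                     # bit 0 = 1: copying sequence
            offset = packed[pos]
            pos += 1
            if offset == 0 or offset > len(out):
                raise ValueError("offset points outside valid data")
            for _ in range(length):          # byte-by-byte, like LDIR
                out.append(out[-offset])
        else:                                # bit 0 = 0: uncompressed data
            out += packed[pos:pos + length]
            pos += length
    return bytes(out)

# Hand-built stream: 3 literal bytes, a 6-byte sequence from 3 bytes back,
# 1 literal byte, end mark.
stream = bytes([3 << 1]) + b"abc" + bytes([(6 << 1) | 1, 3]) \
       + bytes([1 << 1]) + b"!" + bytes([0])
print(lzm_unpack(stream))
```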

LZE = one or two-byte offset

   First byte
     bit 7 ..... offset width: 0 = 7-bit offset 1..127, 1 = 15-bit offset 1..32767
     bit 0-6 ... value of 7-bit offset or high byte of 15-bit offset

The second byte (only in case of 15-bit offset)
bit 0-7 ... Low byte of 15-bit offset

The offset value 0 has no meaning, so it is not valid (LzxList can check it).
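
The LZE <id> and <offset> formats can be combined into a decoder in the same way. The Python
sketch below follows the bit layout given above and assumes that the stored lengths and offsets
are used as-is and that the offset bytes directly follow the <id> bytes of a sequence block.
It is an illustration only, not a port of DecLze, and lze_unpack is a hypothetical name.

```python
def lze_unpack(packed: bytes) -> bytes:
    """Decode an LZE stream as described above (assumptions in the lead-in)."""
    out = bytearray()
    pos = 0
    while True:
        block_id = packed[pos]
        pos += 1
        length = block_id & 0x3F             # bits 0-5
        if block_id & 0x80:                  # bit 7 = 1: 14-bit length
            length = (length << 8) | packed[pos]
            pos += 1
        if block_id & 0x40:                  # bit 6 = 1: copying sequence
            offset = packed[pos]
            pos += 1
            if offset & 0x80:                # bit 7 = 1: 15-bit offset
                offset = ((offset & 0x7F) << 8) | packed[pos]
                pos += 1
            if offset == 0 or offset > len(out):
                raise ValueError("offset points outside valid data")
            for _ in range(length):
                out.append(out[-offset])
        else:                                # bit 6 = 0: uncompressed data
            if length == 0:                  # zero-length data = end mark
                break
            out += packed[pos:pos + length]
            pos += length
    return bytes(out)

# 3 literal bytes, a 6-byte sequence from 3 bytes back, end mark.
stream = bytes([0x03]) + b"abc" + bytes([0x40 | 6, 0x03, 0x00])
print(lze_unpack(stream))
```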

OFD = offset with full variable bit width

<Value> ... offset value in Elias-Gamma coding

It allows any offset to be stored, with emphasis on efficient
storing of the more frequently occurring small offsets.

OF1 = offset with one fixed bit width

<n> ... n-bit value holding the offset decremented by 1

The number 'n' corresponds to the parameter 'A' in the compression type -tXYoAoB

OF2 = two different offsets with fixed bit width

1 <n1> .... shorter offset with fixed width 'n1' bits
0 <n2> .... longer offset with fixed width 'n2' bits

The number 'n1' corresponds to the parameter 'A' and 'n2' corresponds to the parameter 'B' in the compression type -tXYoAoB

Ranges of offsets:
   Shorter offset: 1 .. 2^A             Example for A = 3: 1 .. 8
   Longer offset:  2^A+1 .. 2^A+2^B     Example for B = 5: 9 .. 40

OF4 = four offsets with uniformly scaled bit width

00 <n> ..... shortest offset with 'n' bits width
01 <2*n> ... longer offset with 2*n bits width
10 <3*n> ... even longer with 3*n bits width
11 <4*n> ... longest offset with 4*n bits width

The number 'n' refers to the parameter 'A' in the compression type -tXYoAoB

The offset ranges are determined in the same way as in OF2,
so that they cover the maximum range of values.

Example for A = 2: Offsets will be:

00 ... followed by offset 2 bits width ... covers range 1..4
01 ... followed by offset 4 bits width ... covers range 5..20
10 ... followed by offset 6 bits width ... covers range 21..84
11 ... followed by offset 8 bits width ... covers range 85..340
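
The range rule used by OF2 and OF4 (each offset class starts right after the previous one
ends) can be checked with a short Python sketch. The function names are hypothetical and
the code only reproduces the arithmetic of the examples; it does not read or write any
LzxPack data.

```python
def offset_ranges(widths):
    """Return the (lowest, highest) offset covered by each bit width in turn."""
    ranges = []
    low = 1
    for width in widths:
        high = low + (1 << width) - 1        # 2^width values starting at 'low'
        ranges.append((low, high))
        low = high + 1                       # next class continues where this ends
    return ranges

def of2_ranges(a, b):
    return offset_ranges([a, b])

def of4_ranges(a):
    return offset_ranges([a, 2 * a, 3 * a, 4 * a])

print(of2_ranges(3, 5))   # [(1, 8), (9, 40)]
print(of4_ranges(2))      # [(1, 4), (5, 20), (21, 84), (85, 340)]
```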

Elias gamma coding
~~~~~~~~~~~~~~~~~~
<n> 1 <n>

Where the first <n> is a run of n zero bits, followed by a 1 bit and then the next 'n' bits of the value.

Example:

... 000011010 ...

represents the value 26. The 4 zero bits at the beginning indicate that 4 more bits follow the 1 bit.
The 1 bit together with the next 4 bits forms the requested 5-bit value 26.
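
The coding can be sketched in Python as follows. To stay independent of how LzxPack packs
bits into bytes (which is not described here), the sketch works on a string of '0' and '1'
characters; the function names are hypothetical.

```python
def read_elias_gamma(bits: str, pos: int = 0):
    """Decode one value starting at bits[pos]; return (value, next position)."""
    n = 0
    while bits[pos] == "0":                  # count the leading zero bits
        n += 1
        pos += 1
    pos += 1                                 # skip the 1 bit (top bit of the value)
    value = 1
    for _ in range(n):                       # append the next n bits
        value = (value << 1) | int(bits[pos])
        pos += 1
    return value, pos

def write_elias_gamma(value: int) -> str:
    """Encode a positive integer as n zero bits followed by its binary form."""
    if value < 1:
        raise ValueError("Elias-Gamma coding needs a value >= 1")
    n = value.bit_length() - 1
    return "0" * n + format(value, "b")

print(read_elias_gamma("000011010"))   # (26, 9)
print(write_elias_gamma(26))           # 000011010
```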

%%%%%%%%%%%
%% Thanx %%
%%%%%%%%%%%

Big thanks for the inspiration goes to:

- RM-TEAM for its compression utility QUIDO
- Einar Saukas for his compress program ZX7