Just did some experiments with my dummy LZSS coder:
+ LZSS with 64K window
+ Optimal parsing (S&S in this case is really optimal)
+ Tagged code output
There are two variants of the LZ-output coding:
1. Flag based
1-byte - Defines eight flags (0-Literal, 1-Match)
followed by stored literal (1-byte) or LZ-code (3-bytes):
1-byte - Match length
2-byte - Match offset
2. Tag based
The first bit of a 1-byte tag defines:
0-Run of literals, the number of literals stored in low 7-bits
1-A match, match length stored in low 7-bits of a tag, followed by a 16-bit offset
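A rough sketch of what decoders for the two layouts above could look like, in Python rather than ASM for readability. Several details are not stated above and are assumptions here: flags are read from the lowest bit first, the "first bit" of a tag is taken to be the top bit (0x80), offsets are little-endian distances back from the current position, lengths are stored raw with no minimum-match bias, and the stream simply ends when the input runs out. The function names are made up for the example.

```python
def decode_flags(data: bytes) -> bytes:
    """Variant 1: a flag byte, then up to eight items (literal or 3-byte match)."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        flags = data[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(data):          # assumed: end of input ends the stream
                break
            if flags & (1 << bit):        # 1 = match (assumed LSB-first flag order)
                length = data[pos]
                offset = data[pos + 1] | (data[pos + 2] << 8)  # assumed little-endian
                pos += 3
                for _ in range(length):   # byte by byte, so overlapping matches work
                    out.append(out[-offset])
            else:                         # 0 = literal
                out.append(data[pos])
                pos += 1
    return bytes(out)


def decode_tags(data: bytes) -> bytes:
    """Variant 2: a tag byte, then either a literal run or a 16-bit offset."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        tag = data[pos]
        pos += 1
        count = tag & 0x7F                # low 7 bits: run length or match length
        if tag & 0x80:                    # assumed: top bit set means match
            offset = data[pos] | (data[pos + 1] << 8)          # assumed little-endian
            pos += 2
            for _ in range(count):
                out.append(out[-offset])
        else:                             # run of `count` literals
            out += data[pos:pos + count]
            pos += count
    return bytes(out)
```

Under these assumptions the string "abcabcabc" would be stored as three literals plus one match of length 6 at offset 3:

```python
decode_flags(bytes([0x08]) + b"abc" + bytes([6, 3, 0]))   # b"abcabcabc"
decode_tags(bytes([3]) + b"abc" + bytes([0x86, 3, 0]))    # b"abcabcabc"
```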
It's pure and simple. Mainly I play with such things because the decoder is about as simple as it can get and may be written in pure ASM.
The second variant is much more stable on already compressed data:
A10.jpg:
1: 946,079 bytes
2: 848,320 bytes
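The difference is easy to see from the worst case, where no matches are found and everything is stored as literals. A small sketch of that arithmetic follows. It assumes a literal run holds at most 127 bytes (the largest value of a 7-bit count stored without bias, which is a guess at the actual format), and the function names are hypothetical.

```python
def worst_case_flags(n: int) -> int:
    """All-literal output size for variant 1: one flag byte per 8 literals."""
    return n + (n + 7) // 8


def worst_case_tags(n: int, max_run: int = 127) -> int:
    """All-literal output size for variant 2: one tag byte per literal run.

    max_run=127 is an assumption (7-bit run length stored as is).
    """
    return n + (n + max_run - 1) // max_run


n = 1_000_000
print(worst_case_flags(n))  # 1125000, i.e. 12.5% expansion
print(worst_case_tags(n))   # 1007875, i.e. under 1% expansion
```

The gap between the two A10.jpg figures above, about 11.5% of the smaller one, is roughly in line with this estimate.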
At the same time, the first variant usually provides higher compression on ordinary data:
world95.txt:
1: 968,257 bytes
2: 1,001,897 bytes
Additionally, in both cases we may increase the window size at the cost of a smaller MAXMATCH.
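One way to picture this trade-off: if the match code stays at 3 bytes (24 bits), every extra offset bit doubles the window and halves the longest match that can be stored. The sketch below packs offset and length into one 24-bit little-endian value. The bit layout, the byte order and the names `pack_match`, `unpack_match` and `max_match` are all assumptions for illustration, not the layout of the actual coder.

```python
CODE_BITS = 24  # assumed: the LZ-code stays 3 bytes at every window size


def max_match(offset_bits: int) -> int:
    """Largest raw match length that fits next to an offset of this width."""
    return (1 << (CODE_BITS - offset_bits)) - 1


def pack_match(offset: int, length: int, offset_bits: int = 16) -> bytes:
    """Pack (offset, length) into 3 bytes: length in the high bits, offset in the low."""
    if not 0 <= offset < (1 << offset_bits):
        raise ValueError("offset does not fit in the window")
    if not 0 <= length <= max_match(offset_bits):
        raise ValueError("length exceeds MAXMATCH for this window")
    return ((length << offset_bits) | offset).to_bytes(3, "little")


def unpack_match(code: bytes, offset_bits: int = 16):
    """Inverse of pack_match: returns (offset, length)."""
    value = int.from_bytes(code, "little")
    return value & ((1 << offset_bits) - 1), value >> offset_bits


for bits in (16, 17, 18, 19):
    print(f"{(1 << bits) // 1024}K window -> MAXMATCH {max_match(bits)}")
# 64K window -> MAXMATCH 255
# 128K window -> MAXMATCH 127
# 256K window -> MAXMATCH 63
# 512K window -> MAXMATCH 31
```

With this layout the decoder only needs one extra shift and mask per match, so it stays byte-oriented.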
Check out some test results for the first coding variant with different window sizes:
world95.txt:
64K: 968,257 bytes
128K: 868,035 bytes
256K: 797,373 bytes
512K: 774,220 bytes
For comparison, Deflate compresses world95.txt down to 862,824 bytes, i.e. starting with a 256K window we may beat Deflate even without Huffman or any other entropy coding, using just pure byte I/O with no bit I/O at all!
Anyway, the main benefit is the really crazy decompression speed (fastest?) and an extremely small decompressor.