Recently I implemented LZF decompressors for the Z80 (my main target platform is the ZX Spectrum, but the code should be generic enough to work anywhere).
To achieve higher decompression speed, the decompressors assume the presence of the end marker described by encode, so the data must be compressed using LZF v.1.03.
The decompressor optimized for size is only 56 bytes long and decompresses data at approximately 40.6 t-states per uncompressed byte:
Code:
;
; Size-optimized LZF decompressor by spke (v.1 21-24/08/2018, 56 bytes)
;
; The data must be compressed using the nearly optimal LZF command line compressor
; (c) 2013-2018 Ilya Muravyov (aka encode); the command line is:
;
; lzf.exe cx <sourcefile> <outfile>
;
; where option cx gives you the best possible compression.
;
; The ver.1.03 binary can be downloaded from
; https://encode.su/threads/1819-LZF-Optimized-LZF-compressor?p=57818&viewfull=1#post57818
; (please note that versions prior to ver.1.03 have incompatible format with the unpacker)
;
; The decompression is done in the standard way:
;
; ld hl,CompressedData
; ld de,WhereToDecompress
; call DecompressLZF
;
; Of course, LZF compression algorithm is (c) 2000-2008 Marc Alexander Lehmann;
; see http://oldhome.schmorp.de/marc/liblzf.html
;
; Drop me an email if you have any comments/ideas/suggestions: zxintrospec@gmail.com
;
@DecompressLZF: ld b,0 : jr MainLoop ; all copying is done by LDIR; B needs to be zero
ProcessMatches: exa
ld a,(hl) : inc hl
rlca : rlca : rlca : inc a
and %00000111 : jr nz,CopyingMatch
LongMatch: ld a,(hl) : add 8 : inc hl ; len == 9 means an extra len byte needs to be read
jr nc,CopyingMatch : inc b
CopyingMatch: ld c,a : inc bc : exa : jr nz,NotTheEnd ; token == #20 suggests a possibility of the end marker (#20,#00)
xor a : cp (hl) : ret z ; is it the end marker? return if it is
NotTheEnd: and %00011111 ; A' = high(offset); also, reset flag C for SBC below
push hl : ld l,(hl) : ld h,a ; HL = offset
push de : ex de,hl ; DE = offset, HL = dest
sbc hl,de ; HL = dest-offset
pop de ; DE = current position in the decompressed data
ldir ; copy the match
pop hl : inc hl ; HL = next token in the compressed stream
MainLoop: ld a,(hl) : cp #20 : jr nc,ProcessMatches ; tokens "000lllll" mean "copy lllll+1 literals"
ProcessLiterals: inc a : ld c,a : inc hl : ldir ; actual copying of the literals
jr MainLoop
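For reference, here is a rough C model of the byte stream both routines decode. It is reconstructed from the comments and the code of the listing above; the function name lzf_depack is mine, and the exact conventions (lengths, offsets, end marker) are my reading of the assembly rather than a quote of the liblzf sources.
Code:
#include <stddef.h>
#include <stdint.h>

/* Token layout as implemented by the Z80 code above:
   000LLLLL                    - copy L+1 literal bytes that follow
   LLLOOOOO oooooooo           - match, length = LLL+2 (LLL = 1..6),
                                 offset = OOOOO*256+oooooooo, copy from dst-offset
   111OOOOO EEEEEEEE oooooooo  - long match, length = EEEEEEEE+9
   #20 #00                     - end marker (offset 0 never occurs in a real match) */
size_t lzf_depack(const uint8_t *src, uint8_t *dst)
{
	uint8_t *out = dst;
	for (;;) {
		uint8_t token = *src++;
		if (token < 0x20) {                             /* literal run */
			size_t len = token + 1;
			while (len--) *out++ = *src++;
			continue;
		}
		size_t len = (token >> 5) == 7 ? (size_t)*src++ + 9    /* long match */
		                               : (size_t)(token >> 5) + 2;
		if (token == 0x20 && *src == 0x00)              /* end marker #20,#00 */
			return (size_t)(out - dst);
		size_t offset = ((size_t)(token & 0x1F) << 8) | *src++;
		const uint8_t *ref = out - offset;              /* back-reference into the output */
		while (len--) *out++ = *ref++;                  /* byte-by-byte, like LDIR */
	}
}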
The decompressor optimized for speed is 86 bytes long and decompresses data at approximately 37.6 t-states per uncompressed byte:
Code:
;
; Speed-optimized LZF decompressor by spke (v.1 21-29/08/2018, 86 bytes)
;
; The data must be compressed using the nearly optimal LZF command line compressor
; (c) 2013-2018 Ilya Muravyov (aka encode); the command line is:
;
; lzf.exe cx <sourcefile> <outfile>
;
; where option cx gives you the best possible compression.
;
; The ver.1.03 binary can be downloaded from
; https://encode.su/threads/1819-LZF-Optimized-LZF-compressor?p=57818&viewfull=1#post57818
; (please note that versions prior to ver.1.03 have incompatible format with the unpacker)
;
; The decompression is done in the standard way:
;
; ld hl,CompressedData
; ld de,WhereToDecompress
; call DecompressLZF
;
; Of course, LZF compression algorithm is (c) 2000-2008 Marc Alexander Lehmann;
; see http://oldhome.schmorp.de/marc/liblzf.html
;
; Drop me an email if you have any comments/ideas/suggestions: zxintrospec@gmail.com
;
@DecompressLZF: ld b,0 : jr MainLoop
Token20: inc hl : ld a,(hl) : or a : ret z ; token #20 may be the first byte of the end marker (#20,#00)
ld c,3 ; token #20 always encodes a match of length 3
push hl : ld l,a : ld h,b ; HL = offset (B is always zero here)
jp CopyingMatch2
ProcessMatches: jr z,Token20
add #20 : jr c,LongMatch
ShortMatch: rlca : rlca : rlca
and %00000111
inc a : ld c,a ; BC = len
ld a,(hl) : inc hl
and %00011111 ; A = high(offset)
CopyingMatch: push hl : ld l,(hl) : ld h,a ; HL = offset
CopyingMatch2: push de : ex de,hl ; DE = offset, HL = dest
sbc hl,de ; HL = dest-offset
pop de ; DE = current position in the decompressed data
ldir ; copy the match
pop hl : inc hl ; HL = next token in the compressed stream
MainLoop: ld a,(hl) : cp #20 : jr nc,ProcessMatches
ProcessLiterals: inc a : ld c,a : inc hl ; tokens "000lllll" mean "copy lllll+1 literals"
ldir
ld a,(hl) : cp #20 : jr c,ProcessLiterals
jr z,Token20
add #20 : jp nc,ShortMatch
LongMatch: or a : exa : inc hl
ld a,(hl) : inc hl
add 9 : ld c,a : jr c,IncrementB
exa : jp CopyingMatch
IncrementB: inc b : exa : jp CopyingMatch
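If you want to sanity-check a compressed file on the PC before moving it to the Spectrum, a small harness around the C sketch above could look like this. This is purely hypothetical helper code: the file name is whatever you passed to lzf.exe, and I am assuming, as the comments in the listings suggest, that the file written by lzf.exe cx is exactly the stream fed to DecompressLZF.
Code:
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

size_t lzf_depack(const uint8_t *src, uint8_t *dst);   /* the sketch above */

int main(int argc, char **argv)
{
	if (argc != 2) { fprintf(stderr, "usage: %s <file.lzf>\n", argv[0]); return 1; }

	/* 64K buffers are enough for anything a Spectrum can hold */
	static uint8_t packed[65536], unpacked[65536];

	FILE *f = fopen(argv[1], "rb");
	if (!f) { perror(argv[1]); return 1; }
	size_t packed_size = fread(packed, 1, sizeof(packed), f);
	fclose(f);
	if (packed_size == 0) { fprintf(stderr, "cannot read %s\n", argv[1]); return 1; }

	printf("unpacked %u bytes\n", (unsigned)lzf_depack(packed, unpacked));
	return 0;
}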
I expected these decompressors to have decompression speeds comparable to LZ4's. However, it turned out that with this somewhat more complex data format beating LZ4 is unlikely to be possible: the best one can do with LZ4 is approximately 33 t-states per decompressed byte. At the same time, given the higher compression ratios achieved by LZF, the fast decompressor is still likely to sit on the Pareto frontier of decompression speed vs. compression ratio.