Some news about this project. This version may represent a new generation of my compressors.
First, this version has no file size limits: it now uses 64-bit file sizes. I also switched to pure hash chaining, which brings some compression gain and even a speed improvement. The cost is memory: LZPM now uses N + (16*N), i.e. 17*N, where N is the buffer size. With the 16 MB buffer that works out to 272 MB. Note that decompression still needs only N + 4 MB (i.e. 20 MB) of memory.
In addition, many things were optimized and reworked, and I added compression options such as 'max optimize lookahead'. Actually, I'm still thinking about a more accurate name for this parameter. It defines how many bytes ahead LZPM must check during lazy matching. Zero disables the improved parsing; valid values are from 0 to 3. Higher values are always slower and usually, but not always, give higher compression. The decoder is the same for all modes. The best choice depends on the file: for some files '1' is the best, for others '3', etc.
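To make the lookahead idea concrete, here is a minimal Python sketch of lazy matching with a configurable lookahead. It is not LZPM's code: the token format, the names `longest_match`, `parse` and `decode`, the brute-force search (used instead of hash chains to keep it short) and the "defer if a later match is strictly longer" rule are all assumptions made for illustration.

```python
MIN_MATCH = 3  # assumed minimum match length, not taken from LZPM


def longest_match(data, pos):
    """Brute-force search for the longest earlier match; returns (length, distance)."""
    best_len, best_dist = 0, 0
    for cand in range(pos):
        n = 0
        while pos + n < len(data) and data[cand + n] == data[pos + n]:
            n += 1
        if n > best_len:
            best_len, best_dist = n, pos - cand
    if best_len < MIN_MATCH:
        return 0, 0
    return best_len, best_dist


def parse(data, lookahead=1):
    """Greedy parsing when lookahead == 0, lazy matching otherwise."""
    tokens = []
    pos = 0
    while pos < len(data):
        length, dist = longest_match(data, pos)
        if length == 0:
            tokens.append(("lit", data[pos]))
            pos += 1
            continue
        # Look up to `lookahead` bytes ahead; if a strictly longer match
        # starts there, emit a literal now and decide again at pos + 1.
        defer = False
        for k in range(1, lookahead + 1):
            if pos + k >= len(data):
                break
            if longest_match(data, pos + k)[0] > length:
                defer = True
                break
        if defer:
            tokens.append(("lit", data[pos]))
            pos += 1
        else:
            tokens.append(("match", length, dist))
            pos += length
    return tokens


def decode(tokens):
    """The decoder only replays tokens, so it is the same for every lookahead."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, length, dist = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

For example, on `b"abc_bcdef_abcdef"` greedy parsing (`lookahead=0`) ends with two 3-byte matches, while `lookahead=1` emits a literal `a` followed by one 5-byte match. Whether that is actually smaller depends on how literals and matches are coded, which is one reason a higher setting does not always win.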
Some testing results:
world95.txt
LZPM 0.07, c0: 658,403 bytes
LZPM 0.07, c1: 617,255 bytes
LZPM 0.07, c2: 603,267 bytes
LZPM 0.07, c3: 598,837 bytes
english.dic
LZPM 0.07, c0: 1,017,266 bytes
LZPM 0.07, c1: 1,015,902 bytes
LZPM 0.07, c2: 1,031,857 bytes
LZPM 0.07, c3: 1,048,941 bytes
rafale.bmp
LZPM 0.07, c0: 1,129,188 bytes
LZPM 0.07, c1: 1,102,955 bytes
LZPM 0.07, c2: 1,094,724 bytes
LZPM 0.07, c3: 1,093,084 bytes
acrord32.exe
LZPM 0.07, c0: 1,649,978 bytes
LZPM 0.07, c1: 1,634,282 bytes
LZPM 0.07, c2: 1,630,196 bytes
LZPM 0.07, c3: 1,632,103 bytes
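As a quick check of the "best choice depends on each file" point, the small sketch below picks the smallest result per file from the numbers above. It assumes that c0 to c3 correspond to lookahead values 0 to 3; `RESULTS` and `best_mode` are names made up for this example.

```python
# Compressed sizes in bytes, copied from the results above (index = mode c0..c3).
RESULTS = {
    "world95.txt": [658403, 617255, 603267, 598837],
    "english.dic": [1017266, 1015902, 1031857, 1048941],
    "rafale.bmp": [1129188, 1102955, 1094724, 1093084],
    "acrord32.exe": [1649978, 1634282, 1630196, 1632103],
}


def best_mode(sizes):
    """Return the index of the smallest compressed size."""
    return min(range(len(sizes)), key=sizes.__getitem__)


for name, sizes in RESULTS.items():
    mode = best_mode(sizes)
    gain = 100.0 * (sizes[0] - sizes[mode]) / sizes[0]
    print(f"{name}: best is c{mode}, {gain:.2f}% smaller than c0")
```

On these four files the winner is c3 for world95.txt and rafale.bmp, c1 for english.dic and c2 for acrord32.exe.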
I also played with Flexible Parsing. The conclusion: a match finder based on hash chains does not work well in this case. The improvement over lazy matching is tiny, while compression speed drops greatly. A brute-force match finder may help, but then compression takes forever. In the future I will try to find a match finder that is more efficient in terms of memory use (17*N is okay, but at some point it becomes too heavy) or that works better with flexible/optimal parsing.
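For readers unfamiliar with the term, here is a generic sketch of a hash-chain match finder in Python. It shows the general technique only, not LZPM's implementation: the class name `HashChainFinder`, the 3-byte multiplicative hash, the table sizes and the `max_chain` cutoff are all assumptions.

```python
MIN_MATCH = 3    # assumed: hash and minimum match both cover 3 bytes
HASH_BITS = 16   # assumed hash table size of 2**16 entries


class HashChainFinder:
    def __init__(self, data, max_chain=64):
        self.data = data
        self.max_chain = max_chain             # cap on candidates per search
        self.head = [-1] * (1 << HASH_BITS)    # newest position for each hash
        self.prev = [-1] * len(data)           # previous position with same hash

    def _hash(self, pos):
        d = self.data
        key = (d[pos] << 16) | (d[pos + 1] << 8) | d[pos + 2]
        return ((key * 2654435761) & 0xFFFFFFFF) >> (32 - HASH_BITS)

    def insert(self, pos):
        """Link position `pos` at the front of its hash chain."""
        if pos + MIN_MATCH > len(self.data):
            return
        h = self._hash(pos)
        self.prev[pos] = self.head[h]
        self.head[h] = pos

    def find(self, pos):
        """Walk the chain for `pos`; returns (length, distance) or (0, 0)."""
        data = self.data
        if pos + MIN_MATCH > len(data):
            return 0, 0
        best_len, best_dist = 0, 0
        cand = self.head[self._hash(pos)]
        steps = 0
        while cand >= 0 and steps < self.max_chain:
            n = 0
            while pos + n < len(data) and data[cand + n] == data[pos + n]:
                n += 1
            if n > best_len:
                best_len, best_dist = n, pos - cand
            cand = self.prev[cand]
            steps += 1
        if best_len < MIN_MATCH:
            return 0, 0
        return best_len, best_dist
```

A typical use is to call `find(pos)` and then `insert(pos)` for each position as the parser advances. Per-position tables like `prev` are where this kind of match finder spends its memory, and every extra search costs another chain walk, which fits the slowdown described above for parsers that query many positions.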