Especially for LovePimple - LZPM 0.10-luv-pimp:
lzpm-0.10-luv-pimp.zip
Again, this is only for testing - do not distribute it outside this forum.
This version has:
32 MB buffer, an 8K index, and no EXE filter.
Happy testing!
Of course, I'm waiting for your responses!
The power of ROLZ in its fast modes! Take a look!
Thanks Ilia!
Good speed at the lowest settings, and compression is still good.
I will run further tests at higher settings.
Quick test with World95.txt...
Test was timed with AcuTimer v1.0.
(1) 633 KB (649,070 bytes)
Elapsed Time: 00:00:04.12 (4.12 Seconds)
(4) 578 KB (592,292 bytes)
Elapsed Time: 00:00:15.22 (15.22 Seconds)
(6) 574 KB (588,340 bytes)
Elapsed Time: 00:00:18.43 (18.43 Seconds)
(8) 572 KB (585,956 bytes)
Elapsed Time: 00:00:21.74 (21.74 Seconds)
(9) 560 KB (573,548 bytes)
Elapsed Time: 00:00:44.97 (44.97 Seconds)
I would love it if you could add a super-fast compression setting (-0) as well.
Possibly, adding something faster would produce a too-redundant compressed stream. Maybe "1" is OK - on your machine it works more than 10X faster than "9"! Additional conditioning would degrade the performance of the higher modes. So the current lower modes are just cut-down versions of the highest mode.
Consider the following compression level info:
1 - Fast. Uses greedy parsing. However, it checks all indexes.
2..8 - Normal mode variants. I think "3" is the most efficient. To get higher compression without losing too much time, use "8".
9 - Max. Max is max - to crunch the last byte possible.
Note that with all modes the decoder is the same - we have just one decoder, and an encoder with various settings.
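The greedy parsing mentioned for level 1 can be sketched as follows. This is only a hypothetical illustration, not LZPM's actual code: at each position take the longest match found, otherwise emit a literal. The naive full-window search here stands in for the real index lookup.

```c
#include <stddef.h>

#define MIN_MATCH 3

typedef struct {
    int is_match;          /* 1 = match token, 0 = literal token */
    size_t offset, len;    /* valid when is_match == 1 */
    unsigned char lit;     /* valid when is_match == 0 */
} Token;

/* Naive longest-match search over all earlier positions.
   A stand-in for LZPM's index lookup - for illustration only. */
static size_t longest_match(const unsigned char *buf, size_t pos, size_t n,
                            size_t *best_off) {
    size_t best = 0;
    for (size_t i = 0; i < pos; i++) {
        size_t l = 0;
        while (pos + l < n && buf[i + l] == buf[pos + l]) l++;
        if (l > best) { best = l; *best_off = pos - i; }
    }
    return best;
}

/* Greedy parse: take the longest match at each position, else a literal. */
size_t greedy_parse(const unsigned char *buf, size_t n, Token *out) {
    size_t pos = 0, t = 0;
    while (pos < n) {
        size_t off = 0, len = longest_match(buf, pos, n, &off);
        if (len >= MIN_MATCH) {
            out[t++] = (Token){1, off, len, 0};
            pos += len;
        } else {
            out[t++] = (Token){0, 0, 0, buf[pos++]};
        }
    }
    return t;
}
```

An optimal or lazy parser (the higher levels) would instead weigh taking a match now against a better match one position later; greedy simply commits immediately, which is what makes it fast.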
OK! It's pretty cool as it stands!
Very good improvements! LZPM is a monster!
The more I test it, the more I like this version of LZPM.
Good work, Ilia!
Take into account that I will speed the whole thing up!
I already tested an efficient MOD hash function for 5-byte strings. This hash function has some interesting properties - for example, the generated hash values are far better in terms of cache misses. I believe that I can get a 1.5X...2X speed-up, especially with text files. As a result, the fast mode will be even faster, while the higher modes will be significantly faster.
Awesome!
Tested the improvement - LZPM becomes like a rocket!!
With the "1" level and a 12-bit index the performance is awesome!
Even with a 13-bit index LZPM compresses like it's turbocharged!
The disadvantage is a small loss in compression. However, at some point such a speedup is worth losing a few bytes. After some testing I'll release another "test" pre-release!
Anyway, with such string searching I can use even a 16K index...
Can't wait!
Try using the FNV-1 hash in max mode. It also has good cache behaviour, i.e. if two strings differ only in the last (or first, depending on how you use this hash) byte, then their FNV-1 hashes also differ only in the last byte. It takes longer to compute, but it has much better bit dispersion than a modulus hash, so compression should be even higher than with the old multiplicative hash.
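For reference, a minimal 32-bit FNV-1 sketch with the standard offset basis and prime. FNV-1 multiplies first and XORs each byte afterwards, so a difference confined to the final byte of the input only propagates into the final XOR, exactly the locality property described above:

```c
#include <stdint.h>
#include <stddef.h>

/* FNV-1 (32-bit): multiply by the FNV prime, then XOR in each byte.
   Constants are the standard 32-bit FNV offset basis and prime. */
static uint32_t fnv1_32(const unsigned char *s, size_t n) {
    uint32_t h = 2166136261u;        /* offset basis */
    for (size_t i = 0; i < n; i++) {
        h *= 16777619u;              /* FNV prime */
        h ^= s[i];
    }
    return h;
}
```

Note the variant matters: FNV-1a (XOR first, multiply after) would mix the last byte through a multiplication and lose this exact property.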
Some time ago I tested the FNV hash and was unhappy with it. I will retest FNV anyway. The best thing I've found so far is a multiplicative hash. However, when dealing with 64-bit keys (a 5-byte string is a truncated 64-bit integer), we must multiply by too long a PRIME. So I chose the MOD hash - I've found it's good enough, and it also fits in a single macro!
Since I sped up the search, I tried to use a 16K index (again a slowdown). The speed in this case is equal to LZPM 0.10 on average. However, on small files it is faster, but on large files it's slower.
Results for LZPM with 32 MB buffer and 16K index:
ENWIK8: 26,761,881 bytes
ENWIK9: 232,506,555 bytes
Finally, LZPM beats LZTURBO! So for now I'll keep these settings as the base.
Impressive compression! What's the encoding time?
Not timed - I just checked the compression. The most important thing is FAST decompression! The speed on the ENWIKs must be slightly slower than the current LZPM 0.10. Anyway, you can soon check it for yourself!
I like it!
What about a test version with a 64 MB buffer and a 16K index (of course, only for fans)?
Tested various PRIME numbers. The nearest PRIME to a power of two is not the best. I played with different values and got something in Matt's style:
8345621
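A trial-division primality test is all a small prime number generator (like the one linked later for picking HASHSIZE) really needs at this magnitude. A minimal sketch:

```c
#include <stdint.h>

/* Simple trial-division primality test: check odd divisors up to sqrt(n).
   Plenty fast for hash-table-sized candidates in the millions. */
static int is_prime(uint64_t n) {
    if (n < 2) return 0;
    if (n % 2 == 0) return n == 2;
    for (uint64_t d = 3; d * d <= n; d += 2)
        if (n % d == 0) return 0;
    return 1;
}
```

To generate candidates, one would simply scan upward or downward from a desired table size until `is_prime` succeeds.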
Why only 64 MB? The buffer size is limited only by my imagination and hardware. To be honest, only the memory usage stops me.
For HEAD we need 64 MB, regardless of a buffer size. (The new hash behavior)
For PREV we need BUFFER_SIZE * 8, i.e.:
32 MB * 8 = 256 MB (+ 64 MB = 320 MB)
64 MB * 8 = 512 MB (+ 64 MB = 576 MB)
128 MB * 8 = 1024 MB (+ 64 MB = 1088 MB)
320 MB is OK because you only need 512 MB of RAM installed. With a 64 MB buffer, the user must have at least 1 GB of RAM.
Memory needed for decompression:
BUFFER_SIZE + 16 MB - i.e.:
With 32 MB buffer: 48 MB
With 64 MB buffer: 80 MB
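The arithmetic above can be collapsed into two one-liners (figures in MB, matching the post's numbers): HEAD is a fixed 64 MB, PREV costs 8 bytes per buffer byte, and decompression needs the buffer plus a 16 MB index table.

```c
/* Compression memory: PREV (8x the buffer) + fixed 64 MB HEAD table. */
static int comp_mem_mb(int buf_mb)   { return buf_mb * 8 + 64; }

/* Decompression memory: the buffer itself + a 16 MB index table. */
static int decomp_mem_mb(int buf_mb) { return buf_mb + 16; }
```

So a 128 MB buffer already demands over 1 GB to compress, while decompression stays cheap at 144 MB - which is why the encoder's appetite, not the decoder's, drives the buffer-size vote.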
I just note that ROLZ2 from WinRK uses larger memory sizes even in its fast modes. With ROLZ, such small buffer/index sizes limit its power. That's why I decided to enlarge the memory requirements - thanks to the new Match Finder (5-byte string searching). With the old one I just couldn't do that!
Results with ENWIK8 (different buffer sizes):
LZPM 32 MB: 26,739,440 bytes
LZPM 64 MB: 26,525,306 bytes
LZPM 128 MB: 26,281,978 bytes
By the way, if you recall the old RK, a 16K index (the table size) is its max value!
You can VOTE for buffer/index size anyway!
P.S.
As a bonus, you can download this prime number generator, written by some dude. (I used it to get the HASHSIZE)
Magic!
Excellent idea!
OK, some timings with ENWIK8.
LZPM (16K, 32 MB) with an old hash function and string searching:
LZPM 16K: 26,714,432 bytes, 569 sec
Too slow!!
LZPM with new hash function and string searching:
LZPM 16K: 26,739,440 bytes, 245 sec
LZPM 8K: 27,121,145 bytes, 134 sec
LZPM 4K: 27,684,163 bytes, 70 sec
I think "9" must provide the tightest compressed stream possible. The new Match Finder is faster, but it loses some matches. Maybe I should keep the old Match Finder with an 8K index (LZPM a la the LUV-PIMP version).
The old faithful!
For comparison, results for LUV-PIMP-like:
LZPM 8K: 27,094,740 bytes, 270 sec
Probably with an 8K index LZPM should provide the fastest decompression. With 16K, the index table needed for decompression is simply larger - it equals 16 MB.
Again, vote for the numbers!
16K + 128MB or 64MB buffer.
Instead of using levels, wouldn't it be better to pass the buffer size and index size directly from the command line?
Compression level is a different thing - it relates to neither the buffer nor the index size. The levels are about parsing strategies only.
Like I said, it's better if the index size stays static (hard coded).
A 16K index is a good thing - even for relatively small files (~4 MB) it gives an improvement. This means that smaller indexes don't cover even 4 MB efficiently.
Just at the moment I'm thinking about (LZPM 0.11):
32 MB buffer (to keep memory usage on a leash)
8K or 16K index - not decided yet, but not smaller
The EXE filter will be removed
I will think about the match finder more deeply - possibly I'll further improve the hashing
That's a small config list for LZPM 0.11. I hope to release it soon (within 1-2 weeks, if I get some spare time for LZPM).