To get the highest results, you should "warm up" the core before benchmarking.
Read about it at https://habrahabr.ru/post/113682/ (sorry, in Russian).
Code:
SetProcessAffinityMask(GetCurrentProcess(), 1);  // pin the process to the first core
volatile unsigned zomg = 1;                      // unsigned: the product wraps instead of overflowing
for ( int i=1; i<1000000000; i++ ) zomg *= i;
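For readers who are not on Windows, the same warm-up idea can be sketched in portable C. This is only an illustration: the name warm_up and the time budget are assumptions, it does not pin the process to a core (that part stays platform-specific), and it measures CPU time with clock() instead of counting a fixed number of iterations.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: spin for roughly `seconds` of CPU time so the core can
   ramp up its clock before timing starts. Returns the iterations performed. */
unsigned long warm_up(double seconds)
{
    volatile unsigned sink = 1;      /* volatile keeps the loop from being optimized away */
    unsigned long iterations = 0;
    clock_t start = clock();
    clock_t budget = (clock_t)(seconds * CLOCKS_PER_SEC);

    if (start == (clock_t)-1)        /* processor time not available on this system */
        return 0;

    while (clock() - start < budget) {
        unsigned i;
        for (i = 1; i <= 1000; i++)
            sink *= i | 1u;          /* unsigned arithmetic wraps, so no undefined behaviour */
        iterations += 1000;
    }
    return iterations;
}
```

Calling something like warm_up(0.5) right before the first timed run would be the intended use; the right duration depends on the machine.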
Could you please upload your input file? I want to download it to test my LZ77 examples, thanks
Serge
Bulat Ziganshin (12th August 2016),ne0n (27th June 2016)
Hi Inikep,
I really like lzbench. It is a very nice tool for comparing LZ compression performance.
Could you possibly add GLZA? A modified lzbench and new GLZA source code are attached. Also, I wasn't able to get the "-eall" option to test more than one token's worth of files, so beware that I put in a hack to get around that problem.
Here's a sample of the top 20 results on 1musk10.txt, sorted by compression ratio:
Code:
The results sorted by column number 5:
Compressor name         Compress.   Decompress.  Compr. size  Ratio  Filename
glza 0.7.1              0.16 MB/s     25 MB/s       309745    23.03  1musk10.txt
csc 3.3 -5              3.27 MB/s     35 MB/s       375672    27.94  1musk10.txt
brotli 0.4.0 -11        0.37 MB/s    234 MB/s       382317    28.43  1musk10.txt
lzlib 1.7 -6            1.51 MB/s     39 MB/s       385331    28.65  1musk10.txt
lzlib 1.7 -9            1.54 MB/s     39 MB/s       386360    28.73  1musk10.txt
lzma 9.38 -5            1.58 MB/s     59 MB/s       386490    28.74  1musk10.txt
csc 3.3 -3              4.17 MB/s     34 MB/s       386589    28.75  1musk10.txt
xz 5.2.2 -6             1.70 MB/s     54 MB/s       386689    28.76  1musk10.txt
xz 5.2.2 -9             1.65 MB/s     53 MB/s       386689    28.76  1musk10.txt
zstd 0.8.0 -22          1.82 MB/s    397 MB/s       392000    29.15  1musk10.txt
zstd 0.8.0 -18          2.22 MB/s    404 MB/s       392716    29.20  1musk10.txt
tornado 0.6a -16        1.81 MB/s    124 MB/s       392901    29.22  1musk10.txt
tornado 0.6a -13        3.38 MB/s    116 MB/s       411640    30.61  1musk10.txt
zstd 0.8.0 -15          3.49 MB/s    412 MB/s       413727    30.77  1musk10.txt
lzham 1.0 -d26 -1       1.58 MB/s    140 MB/s       417559    31.05  1musk10.txt
zling 2016-01-10 -4       20 MB/s     83 MB/s       417749    31.07  1musk10.txt
csc 3.3 -1                11 MB/s     34 MB/s       418998    31.16  1musk10.txt
zling 2016-01-10 -3       23 MB/s     83 MB/s       421626    31.35  1musk10.txt
tornado 0.6a -10        4.53 MB/s    121 MB/s       421810    31.37  1musk10.txt
brotli 0.4.0 -8         5.93 MB/s    244 MB/s       425235    31.62  1musk10.txt
Edit: I fixed the table and replaced the code because I found a memory initialization problem that shows up when a file is repeatedly decoded. There seems to be another bug: glza runs fine, but on files over 1 - 3 MB it causes the next program to crash. The cause is not obvious; I checked the memory pointers and they look okay, but the workmem address seems to be larger. I don't understand much of the lzbench code, so I'm not sure what is going on.
Last edited by Kennon Conrad; 14th August 2016 at 22:36.
Stephan Busch (12th August 2016)
Use CODE tag, Luke!
Last edited by Bulat Ziganshin; 12th August 2016 at 15:12.
Kennon Conrad (12th August 2016)
Can anybody please compile it?
Yes. I fixed the problem with GLZA causing other compressors to crash and included a win64 executable this time.
If anyone else wants to compile lzbench with this code, you first need to download the lzbench source code, then replace a few of its files with the ones in this attachment and add the glza directory from the attachment. Then run the Makefile and that should do it (assuming MinGW and POSIX threads are installed).
Last edited by Kennon Conrad; 14th August 2016 at 22:35.
Stephan Busch (12th August 2016)
I've added glza to lzbench (https://github.com/inikep/lzbench/commits/dev) but I'm still testing it.
1. glza doesn't work with gcc older than 4.9 because of:
Code:
glza/GLZAcompress.c:37:23: fatal error: stdatomic.h: No such file or directory
 #include <stdatomic.h>
2. glza uses many threads, and with lzbench's SetPriorityClass/setpriority the computer loses responsiveness, so there is a new switch:
Code:
-r = disable real-time process priority
3. There is still a bug after lzbench-master_glza_prototypeB.zip:
Code:
FATAL ERROR - dictionary malloc failure
To reproduce it: "lzbench.exe -t0 -u0 -j1000 -r -eglza NEWS" (NEWS is from the main directory).
This means 1 compression loop and 1000 decompression loops.
Or "lzbench.exe -r -u2 -eglza NEWS", which means 2 seconds of decompression loops.
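Regarding point 1, one possible workaround is to use <stdatomic.h> only where it exists and fall back to the older GCC __sync builtins elsewhere. The sketch below is an illustration, not GLZA's code: the counter_* names and the HAVE_STDATOMIC macro are made up for this example, and the fallback branch assumes a GCC-compatible compiler.

```c
#include <assert.h>

/* GCC before 4.9 ships no <stdatomic.h>, so detect that case explicitly.
   (Clang also defines __GNUC__, hence the extra check.) */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#define HAVE_STDATOMIC 0
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
      !defined(__STDC_NO_ATOMICS__)
#define HAVE_STDATOMIC 1
#else
#define HAVE_STDATOMIC 0
#endif

#if HAVE_STDATOMIC
#include <stdatomic.h>
typedef atomic_int counter_t;
void counter_init(counter_t *c, int v) { atomic_init(c, v); }
int  counter_add(counter_t *c, int v)  { return atomic_fetch_add(c, v); } /* returns old value */
int  counter_get(counter_t *c)         { return atomic_load(c); }
#else
/* Fallback: legacy __sync builtins (assumes GCC or a compatible compiler). */
typedef volatile int counter_t;
void counter_init(counter_t *c, int v) { *c = v; }   /* call before sharing between threads */
int  counter_add(counter_t *c, int v)  { return __sync_fetch_and_add(c, v); } /* returns old value */
int  counter_get(counter_t *c)         { return __sync_fetch_and_add(c, 0); }
#endif
```

Whether this is enough depends on which atomic operations glza actually needs; the sketch only covers an integer counter.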
Kennon Conrad (12th August 2016)
Is this a problem? Do you know of any simple alternatives?
An area glza could use some work in....
Fixed in lzbench-master_glza_prototypeD, see attached. The glza files and the glza section in compressors.cpp were modified.
Last edited by Kennon Conrad; 14th August 2016 at 22:35. Reason: Fixed bug
Perhaps https://tinycthread.github.io/ might be an option, depending on your needs.
Kennon Conrad (13th August 2016)
Gosh, that's the one thing on a list of things Nemequ gave me to think about that I haven't gotten around to yet. I thought it was for multi-threading only and didn't realize it provides atomics (apparently?). I'll have to take a look.
Thanks for adding and testing. I did find one additional problem and replaced compressors.cpp in the prototypeD file. For files that are delta encoded, GLZA modifies the input, so the glza section in compressors.cpp now creates a local copy of the input to give to GLZA. It looks like maybe this should be done through an init function and use workmem, but I'm not exactly sure about the intent. Also, is there something that should be done to limit GLZA to blocks of 2 GB?
This is unusual and unexpected. You should fix it or give a huge warning next to the function declaration.
Moreover I think you should change your functions from
GLZAformat(insize, inbuf, &outsize);
to
GLZAformat(insize, inbuf, &outsize, outbuf);
which will allow the caller to provide an external output buffer. I think all other compressors do that.
You should allocate the buffer only when outbuf==NULL or outsize==0.
Currently there are too many unnecessary malloc() and free() calls.
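A rough sketch of that calling convention, under stated assumptions: demo_stage is a hypothetical stand-in (the "transform" is a plain copy), and treating *outsize as the capacity of outbuf on entry is my assumption, not something GLZA defines. The function allocates only when no usable buffer is passed in, and it leaves the input untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stage following the suggested signature.
   On entry *outsize is the capacity of outbuf (ignored when outbuf is NULL).
   On success returns the output buffer and stores the bytes written in *outsize.
   Returns NULL on allocation failure or when the caller's buffer is too small. */
uint8_t *demo_stage(size_t insize, const uint8_t *inbuf, size_t *outsize, uint8_t *outbuf)
{
    size_t needed = insize;                      /* a real stage would compute its own bound */

    if (outbuf == NULL || *outsize == 0) {
        outbuf = (uint8_t *)malloc(needed ? needed : 1);
        if (outbuf == NULL)
            return NULL;                         /* report failure instead of exiting */
    } else if (*outsize < needed) {
        return NULL;                             /* caller's buffer cannot hold the result */
    }

    memcpy(outbuf, inbuf, insize);               /* input is const, so it is never modified */
    *outsize = insize;
    return outbuf;
}
```

The caller can tell whether it owns a new allocation by comparing the returned pointer with the buffer it passed in.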
It wasn't bad when input was never reused, always read from disk. Integrating glza into lzbench has pointed out lots of things that needed to be changed so it has been a very good exercise.
Would this be reasonable?:
Code:
int64_t lzbench_glza_compress(char *inbuf, size_t insize, char *outbuf, size_t outsize, size_t, size_t, char*)
{
  outbuf = (char *)GLZAcomp(insize, (uint8_t *)inbuf, &outsize, (uint8_t *)outbuf);
  return outsize;
}
I have this working and it seems like it would be most stable for lzbench, allowing glza to handle any temporary ugliness and be improved over time without impacting the lzbench code. On the bad side, the malloc and free calls still exist, but now they are in the glza code instead of the lzbench code. If you think these need to be eliminated, I could add the workmem parameter. I was thinking it wasn't a big deal because compression is already slow but maybe there is more to it than I am aware of.
It's fine for me.
But please remember to check whether memory was actually allocated, because I found some missing checks in your code:
Code:
int64_t lzbench_glza_compress(char *inbuf, size_t insize, char *outbuf, size_t outsize, size_t, size_t, char*)
{
  char * tempbuf = (char *)malloc(insize);
  if (!tempbuf) return 0;
  memcpy(tempbuf, inbuf, insize);
  inbuf = (char *)GLZAformat(insize, (uint8_t *)tempbuf, &outsize);
  insize = outsize;
  free(tempbuf);
  tempbuf = (char *)GLZAcompress(insize, (uint8_t *)inbuf, &outsize);
  free(inbuf);
  if (!tempbuf) return 0;
  inbuf = tempbuf;
  insize = outsize;
  outbuf = (char *)GLZAencode(insize, (uint8_t *)inbuf, &outsize, (uint8_t *)outbuf, (FILE *)0);
  free(inbuf);
  if (!outbuf) return 0;
  return outsize;
}
When you are compressing small files it doesn't matter, but with 1 GB+ files many malloc() and free() calls can be a problem, as it's hard to allocate contiguous blocks of memory of this size.
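One common way to keep such checks complete is to route every failure through a single cleanup path. The following is a minimal sketch of that pattern only: demo_copy_stage and demo_pipeline are hypothetical names, and the stages just copy their input rather than doing anything GLZA does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stage: returns a freshly malloc'd copy of the input, or NULL on failure. */
char *demo_copy_stage(const char *in, size_t insize, size_t *outsize)
{
    char *out = (char *)malloc(insize ? insize : 1);
    if (out == NULL)
        return NULL;
    memcpy(out, in, insize);
    *outsize = insize;
    return out;
}

/* Runs two stages, checks every returned pointer, and frees the intermediate
   buffers on all paths. Returns the bytes written to outbuf, or 0 on any failure. */
size_t demo_pipeline(const char *inbuf, size_t insize, char *outbuf, size_t outcap)
{
    char *stage1 = NULL, *stage2 = NULL;
    size_t size1 = 0, size2 = 0, result = 0;

    stage1 = demo_copy_stage(inbuf, insize, &size1);
    if (stage1 == NULL) goto cleanup;

    stage2 = demo_copy_stage(stage1, size1, &size2);
    if (stage2 == NULL) goto cleanup;

    if (size2 > outcap) goto cleanup;            /* never write past the caller's buffer */
    memcpy(outbuf, stage2, size2);
    result = size2;

cleanup:
    free(stage2);                                /* free(NULL) is a no-op */
    free(stage1);
    return result;
}
```

Because a failure is reported as 0 instead of terminating the process, the other compressors in a benchmark run can still execute.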
Thanks for the explanation. GLZA uses a lot of memory for compression so if it fails on the I/O malloc it's probably just as well, at least for now.
I added the malloc checks in the GLZA code and changed other exits in the GLZA routines to return a failure (so other programs will still run in lzbench).
Now the compressors.cpp code is this:
Code:
#ifndef BENCH_REMOVE_GLZA
#include "glza/GLZAcomp.h"
#include "glza/GLZAdecode.h"

int64_t lzbench_glza_compress(char *inbuf, size_t insize, char *outbuf, size_t outsize, size_t, size_t, char*)
{
  if (GLZAcomp(insize, (uint8_t *)inbuf, &outsize, (uint8_t *)outbuf, (FILE *)0) == 0) return(0);
  return outsize;
}

int64_t lzbench_glza_decompress(char *inbuf, size_t insize, char *outbuf, size_t outsize, size_t, size_t, char*)
{
  if (GLZAdecode(insize, (uint8_t *)inbuf, &outsize, (uint8_t *)outbuf, (FILE *)0) == 0) return(0);
  return outsize;
}
#endif
and GLZAcomp.o has to be added to the Makefile and the GLZA code replaced.
Last edited by Kennon Conrad; 14th August 2016 at 22:34.
Did some more testing and found that I added a bug when taking out the exits. Aarrgghh! "Final" version attached; I'm removing the others.
inikep (15th August 2016)
It's already on my "to do" list. I'm sure some compressors will have to be disabled because it would be too time-consuming to fix them for Visual Studio. So far I can recommend MinGW for Windows.
FWIW, as of about 2 weeks ago all the codecs which are enabled by default in Squash work with VS 12+ (mostly thanks to Jørgen Ibsen, who has submitted lots of fixes to various codecs), so it should be relatively straightforward to get them working for you. We also have Windows CI builds set up on AppVeyor, and we've already helped a few other projects (like Brotli and LZFSE) do the same, so hopefully we'll notice any future breakage pretty quickly.
IIRC Density is the only somewhat interesting codec which doesn't work on Windows, but we've had to disable it for security reasons (hence "enabled by default").
inikep (30th August 2016)