Please discuss here my programs (except for FreeArc)
Last edited by Bulat Ziganshin; 1st January 2009 at 15:04.
And the first contribution to the thread: an alpha version of Tornado 0.5: http://www.haskell.org/bz/tornado.zip
It has a lot of changes:
- Removed modes -6 -8 -10 -12, renumbered -7/9/11 to -6/7/8
- -7/-8: 1% better compression and 5-20% better speed due to use of ht5
- New -9..-11 modes: 2gb hash and ht5..ht7 match finders
- Support for -l not power of 2 (in all matchfinders?)
- Options -ah, -al, -t#, -cpu, -q, -delete, stats->stderr
- Checks at decoding in order to prevent segfaults
- MSVC compatibility, use of wall-clock times
- Linux: 64-bit file offsets!
The main improvements are the new cmdline/console driver, which shows a lot of info and provides an almost gzip-like interface, plus the new ht5..ht7 match finders. Tests show the following compression/speed improvements:
Code:
Tornado 0.4 -9 : Size 50.727 mb, speed 7.120 mb/s
Tornado 0.5 -7 : Size 50.156 mb, speed 7.570 mb/s
Tornado 0.4 -11: Size 48.977 mb, speed 3.557 mb/s
Tornado 0.5 -8 : Size 48.819 mb, speed 4.439 mb/s
Tornado 0.4 -12: Size 48.325 mb, speed 0.939 mb/s
Tornado 0.5 -9 : Size 47.573 mb, speed 3.409 mb/s

There are still problems to fix, in particular the -5 compression mode has become slower, so it's only an alpha. The archive includes Win32 & Linux executables plus full sources. Enjoy!
ps for benchmarkers: modes -10/-11 are pretty experimental now. it's better to use -7, -8 or -9, depending on your RAM amount. you can also use the -c1..-c3 codecs together with the -7..-9 modes
Thanks. My computer is gonna have a busy weekend, there will be time for this too.
Reducing number of modes is welcome.
Bug: crash with the following command line:
Code:
tor -8 -h512 file

Should be 512m or mb, but a crash... is not a good way of informing about incorrect parameters. :P
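To illustrate the point about rejecting a bad value instead of crashing, here is a minimal C++ sketch of a size-option check. It is not Tornado's actual option parser: the function name `parse_mem_size`, the accepted suffixes (k/kb, m/mb, g/gb, read as powers of 1024) and the rule that a bare number is an error are all assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the Tornado sources).
// Parses strings such as "512m", "512mb", "64k" or "2g" into a byte count.
// Returns false, leaving *out untouched, for anything it does not understand,
// so the caller can print a usage message instead of crashing.
bool parse_mem_size(const char* text, std::uint64_t* out) {
    assert(text != nullptr && out != nullptr);

    // Read the decimal number, checking for overflow on every digit.
    std::uint64_t value = 0;
    const char* p = text;
    for (; std::isdigit(static_cast<unsigned char>(*p)); ++p) {
        std::uint64_t digit = static_cast<std::uint64_t>(*p - '0');
        if (value > (UINT64_MAX - digit) / 10) return false;
        value = value * 10 + digit;
    }
    if (p == text) return false;  // no digits at all

    // Read the unit suffix, case-insensitively.
    std::string unit;
    for (; *p; ++p)
        unit += static_cast<char>(std::tolower(static_cast<unsigned char>(*p)));

    int shift;
    if (unit == "k" || unit == "kb")      shift = 10;
    else if (unit == "m" || unit == "mb") shift = 20;
    else if (unit == "g" || unit == "gb") shift = 30;
    else return false;  // missing or unknown unit (assumed to be an error here)

    if (value > (UINT64_MAX >> shift)) return false;  // result would overflow
    *out = value << shift;
    return true;
}
```

With a check like this, `-h512` would be reported as an invalid parameter while `-h512m` and `-h512mb` would both yield 536870912 bytes.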
Quick tests:
Code:
TCUP.tar:
0.4 -12 -c3  99161214
0.5 -8  -c3  98876562 (-0.29%)
0.5 -9  -c3  98802466 (-0.36%)
0.4 -12 -c4  97374193
0.5 -8  -c4  97074630 (-0.31%)
0.5 -9  -c4  97002530 (-0.38%)

Code:
Exe.exe:
0.4 -12 -c3  34284810
0.5 -8  -c3  34072634 (-0.62%)
0.5 -9  -c3  34005910 (-0.81%)
0.4 -12 -c4  33729113
0.5 -8  -c4  33528064 (-0.60%)
0.5 -9  -c4  33456108 (-0.81%)

In -9 I had to reduce buffer and hash sizes in order to make it fit in my memory.
Also, I found some inconsistency: please decide whether you use good ol' kB=1024B or the HDD one. Currently statistics show the second one, while all other places - the first.
One more night - one more test:
Code:
Tornado 0.5:
-1:  Size 100.014 mb, speed 220.675 mb/s
-2:  Size  84.102 mb, speed 166.346 mb/s
-3:  Size  70.113 mb, speed 102.964 mb/s
-4:  Size  64.548 mb, speed  48.137 mb/s
-5:  Size  57.282 mb, speed  23.737 mb/s
-6:  Size  53.445 mb, speed  13.235 mb/s
-7:  Size  50.156 mb, speed   7.295 mb/s
-8:  Size  48.819 mb, speed   4.299 mb/s
-9:  Size  47.573 mb, speed   3.321 mb/s
-10: Size  47.410 mb, speed   3.111 mb/s
-11: Size  47.371 mb, speed   2.401 mb/s

(these results, like the previous ones, are geometric averages of tests on 15 files, including 7 binary and 8 text ones. geom. average filesize = 227.728 mb. my cpu is a q6600 @ 3.25 GHz)
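For readers who want to reproduce this kind of summary, here is a small C++ sketch of a geometric average over per-file results. It is only an illustration: the function name `geometric_mean` is made up for this example and is not taken from the benchmark scripts used for the numbers above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of positive values (e.g. per-file compressed sizes or speeds).
// Summing logarithms instead of multiplying the values directly avoids
// overflow when many large numbers are combined.
double geometric_mean(const std::vector<double>& values) {
    assert(!values.empty());
    double log_sum = 0.0;
    for (double v : values) {
        assert(v > 0.0);  // the geometric mean is only defined for positive values
        log_sum += std::log(v);
    }
    return std::exp(log_sum / static_cast<double>(values.size()));
}
```

Compared with an arithmetic average, one very large file has far less influence on the result, which is why this kind of average is a common choice for mixed test sets.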
>please decide whether you use good ol' kB=1024B or the HDD one. Currently statistics show the second one, while all other places - the first.
i use mib when setting dictionary/hash sizes since they need to be powers of 2. i use mb for measuring stats
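A tiny C++ sketch of the distinction, for anyone comparing the numbers: the constants and helper names below (`MiB`, `MB`, `is_power_of_two`, `bytes_to_mb`) are chosen for this example and are not Tornado identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Binary unit, used here for dictionary/hash sizes.
const std::uint64_t MiB = 1024ull * 1024ull;
// Decimal unit, used here for reported statistics.
const std::uint64_t MB = 1000ull * 1000ull;

// True when n has exactly one bit set.
bool is_power_of_two(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Converts a byte count to decimal megabytes for reporting.
double bytes_to_mb(std::uint64_t bytes) {
    return static_cast<double>(bytes) / static_cast<double>(MB);
}
```

For example, 64 MiB is 67108864 bytes and is a power of two, while 64 MB is 64000000 bytes and is not; the same 64 MiB buffer would be reported as about 67.109 mb in decimal units.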
Last edited by Bulat Ziganshin; 11th December 2008 at 14:00.
Hey Bulat, Tornado command line stops sometimes. But it continues to work and compresses successfully.
Bulat, could you build a full version of Tor?
Hello Bulat, I would like to run Tornado on PowerPC (big endian), and I'm currently working on the code to make it independent from the byte order of the host processor.
[It seems that] I already got modes 5..7 working (can't test higher modes because of not enough RAM). But there are some strange results that drive me crazy and make testing really difficult:
When calling tor in modes 4..6 again and again (always with the same parameters), every few cycles the compressed output file is different in length and content. Some certain output is generated more often than the other variants, but all of them decompress properly. Is this behavior by design?
In mode 7, output seems to stay constant. But it doesn't decompress properly: The output has some 0-bytes appended compared to the original.
All testing has been done with the last alpha version in Linux, also with the binary you provide.
Does somebody encounter similar behavior in Windows?
i can. just thought that it should be interesting for benchmarking only?
>Tornado command line stops sometimes. But it continues to work and compresses successfully.
what exactly do you mean?
>When calling tor in modes 4..6 again and again (always with the same parameters), every few cycles the compressed output file is different in length and content. Some certain output is generated more often than the other variants, but all of them decompress properly. Is this behavior by design?
no. looks like some bug like uninitialized vars. i had such a problem in 4x4:tornado 0.2, but don't know about such problems in tornado itself
btw, i will be glad to get your patches, even if not everything works yet. 64-bit versions are also welcome
i have checked -6 and -7 - no problems you mention encountered. will check linux today too..
In the meantime I made some progress regarding endian independence. With the attached patch, I got at least modes 2..7 (can't test the higher ones) working. Tests are still running, but until now, I could properly decompress all files which I compressed on PowerPC with the x86 Linux binary from your alpha. Fixing mode 1 needs some more effort, because of those many 16 bit pointer casts.
Regarding the "bug" I mentioned in my recent post: It only occurs with very few files. You are probably interested in a test file. I will see if I can find one that is not too confidential.
Last edited by jo.henke; 17th December 2008 at 00:44.
Last edited by jo.henke; 17th December 2008 at 00:47.
With the binary from your alpha and my test file (post #16), you can do something like this in x86 Linux:
Code:
$ while linux/tor -4 -q bash; do stat --printf='%s\t' bash.tor; sha1sum bash.tor; rm bash.tor; done
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369326 429cb9df88eac9d0bb1d6dfaa101345439544a75 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369326 429cb9df88eac9d0bb1d6dfaa101345439544a75 bash.tor
369326 a72de5f78598450bc88cfb3c7d1e85e05453f7f7 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369326 429cb9df88eac9d0bb1d6dfaa101345439544a75 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
369322 e89a3dea0bb8263f7a0ceb19a1c7a087f74a6ee9 bash.tor
^C
$

It may take a few cycles until differences show up. In modes 5 and 6 the output is similar.
yes, i made just 4 runs
thanks, got the same result!
Updated http://www.haskell.org/bz/tornado.zip
Changes in this alpha:
- Fixed bugs in -4 and -7 modes reported by Joachim Henke
- Added low-endian cpus compatibility, thanks again to Joachim Henke (use ./compile-le)
- Fixed 10% speed loss in -5 mode compared to tornado 0.4
Hi Bulat, thanks a lot for this update and for including my patch!
But there is a misunderstanding: I called my patch tor-le, because it makes Tornado always output little-endian data - independently from the processor's byte order (be it little-endian, big-endian or even mixed-endian).
(because you confused it a bit in your current release: x86 is little-endian and PowerPC is big-endian)
For now it's good, that you included the #ifdefs on the byte-order, because your code (being specific to little-endian) is faster than the one from my patch. But since my code is the generic one (not limited to MOTOROLA_BYTE_ORDER), I suggest this small adjustment:
Code:
--- src/Common.h
+++ src/Common.h
@@ -143,5 +143,5 @@
 #define setvalue24(p,x) (*(uint32*)(p) = ((x)&0xffffff)+(*(uint*)(p)&0xff000000))
-#elif FREEARC_MOTOROLA_BYTE_ORDER
+#else
 #define value16(p) getvalue(16, p)
--- src/EntropyCoder.cpp
+++ src/EntropyCoder.cpp
@@ -68,7 +68,5 @@
 void put32 (uint c) {*(uint32*)output = c; advance(4);}
 void put64 (uint64 c) {*(uint64*)output = c; advance(8);}
- // Writes machine-size word
- void putword(uint c) {*(uint *)output = c; advance(sizeof(uint));}
-#elif FREEARC_MOTOROLA_BYTE_ORDER
+#else
 void put8 (uint c) {*output++ = c;}
 void put16 (uint c) {for (int s = 0; s < 16; s += 8) {*output++ = c; c >>= 8;}}
@@ -164,5 +162,5 @@
 uint get32 () {fill(); uint n = *(uint32*)input; input+=4; return n;}
 uint64 get64 () {fill(); uint64 n = *(uint64*)input; input+=8; return n;}
-#elif FREEARC_MOTOROLA_BYTE_ORDER
+#else
 uint get16 () {fill(); uint n = *input++; for (int s = 8; s < 16; s += 8) n |= (uint) *input++ << s; return n;}
 uint get24 () {fill(); uint n = *input++; for (int s = 8; s < 24; s += 8) n |= (uint) *input++ << s; return n;}

Your code currently doesn't cause warnings, so I would also propose to remove the options "-Wno-unknown-pragmas -Wno-sign-compare -Wno-conversion" from the compile scripts. This way it's easier to keep it "clean".
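For readers following the byte-order discussion, here is a standalone C++ sketch of the generic idea behind the `#else` branch in the patch above: values are written and read one byte at a time, least-significant byte first, so the stored data is little-endian regardless of the host CPU. The free functions `put_le` and `get_le` are names invented for this example; the real code uses member functions such as put16/get16 operating on the coder's own buffers.

```cpp
#include <cassert>
#include <cstdint>

// Stores the low `bits` bits of `value` at `out`, least-significant byte first.
// Only shifts and byte stores are used, so the result does not depend on the
// host's byte order and no unaligned word access is needed.
inline void put_le(std::uint8_t* out, std::uint64_t value, int bits) {
    assert(out != nullptr && bits > 0 && bits <= 64 && bits % 8 == 0);
    for (int s = 0; s < bits; s += 8)
        *out++ = static_cast<std::uint8_t>(value >> s);
}

// Reads a `bits`-bit little-endian value starting at `in`.
inline std::uint64_t get_le(const std::uint8_t* in, int bits) {
    assert(in != nullptr && bits > 0 && bits <= 64 && bits % 8 == 0);
    std::uint64_t n = 0;
    for (int s = 0; s < bits; s += 8)
        n |= static_cast<std::uint64_t>(*in++) << s;
    return n;
}
```

The trade-off is the one mentioned in the thread: the pointer-cast versions are faster on little-endian x86, while the byte-wise versions work on any host.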
Running tor through memcheck from the valgrind suite, I found that it still accesses uninitialized memory. Have a look at the attached output. The outer "if" in CHECK_FOR_DATA_TABLE seems to compare empty fields in the buffer.
1) i didn't use #else because i fear that people who forget to put -DINTEL on the cmdline will accidentally get the slower version. it's better to explicitly control important params
2) i will change history.txt
3) cmdline is stupidly copied around my projects. i will remove unused -W. probably it's from grzip
Updated http://www.haskell.org/bz/tornado.zip
Changes in this alpha:
- Fixed problems with Valgrind
- -3 now has a larger hash (64k->128k) and enables the delta filter again
- Improved lazy matching a bit
- fixed -oFILE bug when FILE already exists
- added dynamic LZ77 coder so now 1) executables become smaller, 2) -c1..-c3 are available in -10 and -11 modes in the small tornado.exe
>you already have such a barrier in Common.h
i still like to have excessive checks rather than the probability that someone will accidentally get a slower executable
Thanks a lot for this update!
The improved mode 3 is amazing. I did some tests with a cpio archive (3.5 GiB) which contains a Linux root file system. Here, tor -3 beats gzip -6 in compression ratio, while being almost 5 times faster (and nearly 2x faster than gzip -1).
Keep up your great work!
Btw, I recommend adding something like
Code:
--- src/Common.cpp
+++ src/Common.cpp
@@ -357,4 +357,5 @@
 #ifdef FREEARC_UNIX
+#ifndef FREEARC_NO_TIMING
 // Returns number of wall-clock seconds since some moment
 double GetGlobalTime (void)
@@ -374,4 +375,5 @@
 return res? -1 : (ts.tv_sec + ((double)ts.tv_nsec) / 1000000000);
 }
+#endif
 #endif // FREEARC_UNIX

to avoid linker errors when compiling with "-DFREEARC_NO_TIMING" but without "-fwhole-program -lrt". This might be essential for users of older GCC versions or different compilers.