
Thread: Tornado - fast lzari compressor

  1. #1
    Bulat Ziganshin (Programmer)
    i'm glad to present my new work. this program is intended for the same niche as THOR, and it seems that it finally slightly outperforms this great packer. it includes two novel ideas - a quad-style match finder and a fast arithmetic coder for frequencies. more on this tomorrow; for now, just try it yourself

    quick note about compression modes: -1 to -12 are supported. the following table shows the closest modes of other programs:

    -1 QuickLZ, thor ef
    -2 thor e
    -3 thor ex
    -5 gzip -5, rar -m1
    -7 gzip -9, thor exx
    -9 7zip -mx1

    there are several tunable parameters; options -1..-12 just select predefined sets of them. you can tune the compression mode manually by changing parameters after the mode is specified, e.g. "tor -9 -c2"

  2. #2
    Bulat Ziganshin (Programmer)
    i was sleepy and forgot to include the url: http://www.haskell.org/bz/tor.rar

  3. #3
    Awe (Member)
    Bulat,

    Here are some preliminary results and observations.
    My test set contained some doc, pdf, ps, tex, jpg, and zip files. The total size of the test set is 26306560 bytes. Varying the predefined compression profiles from 1 to 12, and keeping the default values for the coder and block size, I got

    16kb hash1, block 1mb: 26306 -> 21232 kb (80.7%): 0.954 sec
    64kb hash2, block 2mb: 26306 -> 20825 kb (79.2%): 1.056 sec
    64kb hash2, block 4mb: 26306 -> 20816 kb (79.1%): 1.109 sec
    128kb hash2, block 8mb: 26306 -> 20558 kb (78.2%): 1.134 sec
    1mb hash2, block 16mb: 26306 -> 19484 kb (74.1%): 1.625 sec
    4mb hash4x, block 25mb: 26306 -> 19121 kb (72.7%): 3.386 sec
    8mb hash8x, block 25mb: 26306 -> 19056 kb (72.4%): 7.560 sec
    16mb hash16x, block 25mb: 26306 -> 19036 kb (72.4%): 12.201 s
    32mb hash32x, block 25mb: 26306 -> 19028 kb (72.3%): 17.613 s
    64mb hash64x, block 25mb: 26306 -> 19026 kb (72.3%): 27.333 s
    128mb hash128x, block 25mb: 26306 -> 19027 kb (72.3%): 48.243 s
    256mb hash256x, block 25mb: 26306 -> 19027 kb (72.3%): 91.539 s

    For comparison purposes the same file was compressed by thor v0.93a:
    RATIO: 79.00% 6.320 bpB; TIME: 2sec 554msec; SPEED: 9.82 MB/sec

    Observations.
    Thor is indeed outperformed in both speed and compression ratio (well done!)
    The last four compression modes had nearly the same compression ratios
    Compression time does not appear to grow exponentially (great!)

    In summary: great work, Bulat! I also hope that this new compressor will leave you some free time for further development of FreeArc!

  4. #4
    Bulat Ziganshin (Programmer)
    1. i think that including jpeg/zip files in the test is meaningless, at least when comparing archivers which don't include special algorithms for such file types

    2. it seems that you tested thor in exx mode while its best mode imho is ex

    3. the -10..-12 compression modes exist mainly to test the speed of larger hash rows (for comparison, 7z doesn't check more than 48 elements even with the ultra setting, compared to 256 elements with my -12). i guess that compression doesn't improve in these modes because i don't decline far matches with len=4

    4. speed in the fast modes (those comparable with thor) greatly depends on cpu cache size. for example, from your tests i guess that your cpu has 1mb, or more probably 2mb, of cache

    5. compression time in the last modes grows exponentially, with a ~1.5x coefficient for each 2x larger hash row - even more in your test

    6. this codec was intended exactly for the fast freearc compression modes. if thor's sources were available, i wouldn't have bothered writing my own fast compressor


    reading your test results, i see something very strange: it's not my program! at least not the version i compiled and uploaded this night. the first lines of bench.cmd output should look like:

    16kb hash1, block 1mb, bytecoder: compressed 239919 -> 84283 kb (35.1%): 3.078 seconds
    64kb hash2, block 2mb, bitcoder: compressed 239919 -> 62999 kb (26.3%): 6.533 seconds
    64kb hash2, block 4mb, aricoder: compressed 239919 -> 56869 kb (23.7%): 9.545 seconds

    notice the encoder here: bytecoder/bitcoder/aricoder. also notice that speed changes significantly depending on the encoder used. it seems that in your test only one fixed codec was used. did you modify the program sources?

  5. #5
    Awe (Member)
    Quote Originally Posted by Bulat Ziganshin
    1. i think that including jpeg/zip files in the test is meaningless, at least when comparing archivers which don't include special algorithms for such file types
    I would guess so too. Jpeg/zip files just happened to be in the test set.
    Quote Originally Posted by Bulat Ziganshin
    2. it seems that you tested thor in exx mode while its best mode imho is ex
    Agreed.
    Quote Originally Posted by Bulat Ziganshin
    4. speed in the fast modes (those comparable with thor) greatly depends on cpu cache size. for example, from your tests i guess that your cpu has 1mb, or more probably 2mb, of cache
    It's 2MB.
    Quote Originally Posted by Bulat Ziganshin
    5. compression time in the last modes grows exponentially, with a ~1.5x coefficient for each 2x larger hash row - even more in your test
    I guess I don't fully understand the definition of exponential growth.

    Quote Originally Posted by Bulat Ziganshin
    6. this codec was intended exactly for the fast freearc compression modes. if thor's sources were available, i wouldn't have bothered writing my own fast compressor
    Oh, good.

    Quote Originally Posted by Bulat Ziganshin
    reading your test results, i see something very strange: it's not my program! ... did you modify the program sources?
    No, it is your program. I simply did some editing of the output.

  6. #6
    Bulat Ziganshin (Programmer)
    Quote Originally Posted by Awe
    It's 2MB.
    the optimal hash size for very fast compression is 1/4-1/8 of the cache size. so on your cpu -4 -h17 will be great

    Quote Originally Posted by Awe
    I simply did some editing of the output.
    no problem, but it seems that you also tested it with the -c4 switch even in the fastest compression modes. at least, these modes are too slow in your results and compress too well. in particular, the main change between the -2 and -3 modes is the use of aricoder instead of bitcoder. the actual -1 mode should be 2-3 times faster than in your test

  7. #7
    Bulat Ziganshin (Programmer)
    Quote Originally Posted by Awe
    I guess I don't fully understand the definition of exponential growth.
    compression time (in the higher modes) is something like const*1.5^N, where N is the hash line size (the -l parameter). that is why it looks like exponential growth
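
    To spell out the claim (my notation, reading N as the -l setting, i.e. roughly the number of doublings of the hash row):

    t(N) \approx C \cdot 1.5^{N}, \qquad \frac{t(N+1)}{t(N)} \approx 1.5

    i.e. each increment of N multiplies the time by a roughly constant factor (~1.5), which is the defining property of exponential growth in N.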

  8. #8
    Moderator
    Quote Originally Posted by Bulat Ziganshin
    i'm glad to present my new work. this program is intended for the same niche as THOR, and it seems that it finally slightly outperforms this great packer. it includes two novel ideas - a quad-style match finder and a fast arithmetic coder for frequencies. more on this tomorrow; for now, just try it yourself
    Thanks Bulat!

  9. #9
    Moderator
    Quick test...

    MC SFC test:

    TOR -12

    A10.jpg > 842,477
    AcroRd32.exe > 1,828,645
    english.dic > 1,271,875
    FlashMX.pdf > 3,838,649
    FP.LOG > 1,177,834
    MSO97.DLL > 2,204,137
    ohs.doc > 893,712
    rafale.bmp > 1,284,231
    vcfiu.hlp > 859,099
    world95.txt > 723,309

    Total = 14,923,968 bytes


    ENWIK8:

    THOR EF
    Compressed Size = 52.5 MB (55,063,944 bytes)
    Compression Time = 00:00:02.953

    QuickLZ v1.20
    Compressed Size = 54.4 MB (57,147,067 bytes)
    Compression Time = 00:00:02.828

    TOR -1
    Compressed Size = 57.2 MB (60,063,659 bytes)
    Compression Time = 00:00:02.703


    THOR E
    Compressed Size = 43.7 MB (45,842,692 bytes)
    Compression Time = 00:00:03.688

    LZOP v1.02rc1 -1
    Compressed Size = 50.9 MB (53,437,564 bytes)
    Compression Time = 00:00:04.562

    TOR -2
    Compressed Size = 42.7 MB (44,863,800 bytes)
    Compression Time = 00:00:04.015


    THOR EX
    Compressed Size = 39.7 MB (41,670,916 bytes)
    Compression Time = 00:00:06.000

    TOR -3
    Compressed Size = 38.2 MB (40,102,541 bytes)
    Compression Time = 00:00:05.406

    TOR -4
    Compressed Size = 37.5 MB (39,419,749 bytes)
    Global Time = 00:00:06.906

    TOR -5
    Compressed Size = 36.5 MB (38,278,003 bytes)
    Global Time = 00:00:12.593

    TOR -6
    Compressed Size = 35.2 MB (36,990,054 bytes)
    Global Time = 00:00:17.094

    THOR EXX
    Compressed Size = 34.0 MB (35,696,060 bytes)
    Compression Time = 00:00:21.437

    TOR -7
    Compressed Size = 34.2 MB (35,959,668 bytes)
    Compression Time = 00:00:26.156

    TOR -8
    Compressed Size = 33.4 MB (35,120,351 bytes)
    Compression Time = 00:00:38.328

    LZOP v1.02rc1 -9
    Compressed Size = 39.3 MB (41,217,688 bytes)
    Compression Time = 00:01:01.453

    TOR -9
    Compressed Size = 32.8 MB (34,491,218 bytes)
    Compression Time = 00:01:04.421

    TOR -10
    Compressed Size = 32.4 MB (34,009,747 bytes)
    Compression Time = 00:01:53.375


    TOR -11
    Compressed Size = 32.0 MB (33,629,911 bytes)
    Compression Time = 00:03:24.140


    TOR -12
    Compressed Size = 31.8 MB (33,348,632 bytes)
    Compression Time = 00:06:18.562




    TOR -mem enwik8

    Benchmarking 100000 kb, aricoder: 0%
    Benchmarked 100000 kb, aricoder
    16kb hash1: 44346 kb (44.3%): 3.345 seconds
    32kb hash1: 43937 kb (43.9%): 3.476 seconds
    64kb hash1: 43834 kb (43.8%): 4.355 seconds

    16kb hash2: 41950 kb (42.0%): 3.768 seconds
    32kb hash2: 40823 kb (40.8%): 3.856 seconds
    64kb hash2: 39964 kb (40.0%): 4.332 seconds
    128kb hash2: 39350 kb (39.4%): 5.874 seconds
    256kb hash2: 38856 kb (38.9%): 8.562 seconds
    512kb hash2: 38498 kb (38.5%): 10.554 seconds
    1mb hash2: 38243 kb (38.2%): 11.688 seconds
    4mb hash2: 37932 kb (37.9%): 12.544 seconds
    16mb hash2: 37807 kb (37.8%): 14.463 seconds

    1mb hash4x: 37569 kb (37.6%): 13.442 seconds
    4mb hash4x: 36967 kb (37.0%): 15.700 seconds
    16mb hash4x: 36679 kb (36.7%): 17.177 seconds

    1mb hash8x: 36963 kb (37.0%): 21.353 seconds
    4mb hash8x: 36159 kb (36.2%): 24.032 seconds
    16mb hash8x: 35789 kb (35.8%): 25.209 seconds

    1mb hash16x: 36574 kb (36.6%): 30.193 seconds
    4mb hash16x: 35613 kb (35.6%): 34.017 seconds
    16mb hash16x: 35120 kb (35.1%): 36.546 seconds

    1mb hash32x: 36368 kb (36.4%): 48.963 seconds
    4mb hash32x: 35284 kb (35.3%): 55.697 seconds
    16mb hash32x: 34636 kb (34.6%): 59.357 seconds




    TOR -mem scribble.wav

    Benchmarking 9249 kb, aricoder: 0%
    Benchmarked 9249 kb, aricoder
    16kb hash1: 236 kb (2.6%): 0.041 seconds
    32kb hash1: 236 kb (2.6%): 0.041 seconds
    64kb hash1: 236 kb (2.6%): 0.043 seconds

    16kb hash2: 136 kb (1.5%): 0.043 seconds
    32kb hash2: 136 kb (1.5%): 0.044 seconds
    64kb hash2: 136 kb (1.5%): 0.043 seconds
    128kb hash2: 136 kb (1.5%): 0.043 seconds
    256kb hash2: 136 kb (1.5%): 0.050 seconds
    512kb hash2: 136 kb (1.5%): 0.044 seconds
    1mb hash2: 136 kb (1.5%): 0.047 seconds
    4mb hash2: 136 kb (1.5%): 0.053 seconds
    16mb hash2: 136 kb (1.5%): 0.086 seconds

    1mb hash4x: 115 kb (1.3%): 0.055 seconds
    4mb hash4x: 115 kb (1.3%): 0.064 seconds
    16mb hash4x: 115 kb (1.3%): 0.097 seconds

    1mb hash8x: 98 kb (1.1%): 0.062 seconds
    4mb hash8x: 98 kb (1.1%): 0.070 seconds
    16mb hash8x: 98 kb (1.1%): 0.101 seconds

    1mb hash16x: 85 kb (0.9%): 0.070 seconds
    4mb hash16x: 85 kb (0.9%): 0.075 seconds
    16mb hash16x: 85 kb (0.9%): 0.137 seconds

    1mb hash32x: 75 kb (0.8%): 0.086 seconds
    4mb hash32x: 75 kb (0.8%): 0.094 seconds
    16mb hash32x: 75 kb (0.8%): 0.137 seconds

  10. #10
    Black_Fox (Member)
    My test set, tarred, is 30,318,592 bytes

    TORNADO:
    16kb hash1, block 1mb, bytecoder: compressed 30318 -> 23630 kb (77.9%): 0.390 seconds
    64kb hash2, block 2mb, bitcoder: compressed 30318 -> 21617 kb (71.3%): 0.641 seconds
    64kb hash2, block 4mb, aricoder: compressed 30318 -> 19210 kb (63.4%): 1.078 seconds
    128kb hash2, block 8mb, aricoder: compressed 30318 -> 19052 kb (62.8%): 1.219 seconds
    1mb hash2, block 16mb, aricoder: compressed 30318 -> 15052 kb (49.6%): 2.437 seconds
    4mb hash4x, block 28mb, aricoder: compressed 30318 -> 14620 kb (48.2%): 2.422 seconds
    8mb hash8x, block 28mb, aricoder: compressed 30318 -> 14566 kb (48.0%): 3.953 seconds
    16mb hash16x, block 28mb, aricoder: compressed 30318 -> 14540 kb (48.0%): 5.110 seconds
    32mb hash32x, block 28mb, aricoder: compressed 30318 -> 14528 kb (47.9%): 7.031 seconds
    64mb hash64x, block 28mb, aricoder: compressed 30318 -> 14533 kb (47.9%): 10.781 seconds
    128mb hash128x, block 28mb, aricoder: compressed 30318 -> 14534 kb (47.9%): 17.688 seconds
    256mb hash256x, block 28mb, aricoder: compressed 30318 -> 14532 kb (47.9%): 30.703 seconds

    THOR (original filesize = 30,318,592):
    ef_ ->21,384,924 B / RATIO: 70.53% / 5.643 bpB / 0sec 516msec / SPEED: 56.04 MB/sec
    e__ ->19,814,532 B / RATIO: 65.35% / 5.228 bpB / 0sec 750msec / SPEED: 38.55 MB/sec
    ex_ ->16,756,052 B / RATIO: 55.27% / 4.421 bpB / 1sec 265msec / SPEED: 22.86 MB/sec
    exx ->18,040,612 B / RATIO: 59.50% / 4.760 bpB / 3sec 328msec / SPEED: _8.69 MB/sec

  11. #11
    Bulat Ziganshin (Programmer)
    Black_Fox
    probably your test of thor ef includes disk read time; its speed should be about 50-60 mb/s

    also i should say that my program doesn't include i/o times in its output, which matters for the fastest modes. this 30 mb file, for example, needs about 0.1 seconds to be read from the cache, so you should either add this time to my program's stats or subtract it from thor's stats

  12. #12
    Black_Fox (Member)
    Quote Originally Posted by Bulat Ziganshin
    probably your test of thor ef includes disk read time; its speed should be about 50-60 mb/s
    I reran all tests twice and updated the numbers - tor -1 to -8 were a bit slower and -9 to -12 a bit faster than previously reported (usually by no more than 0.06s). Thor -ef changed a lot

  13. #13
    nimdamsk (Member)
    Bravo Bulat! Tornado is awesome! I've been waiting for a Thor competitor for a long time; maybe it's good that Thor is closed-source

    Here are my test results. I packed a fully installed NERO 7 Ultimate folder into an uncompressed (store-mode) 7-ZIP archive (410,798,906 bytes). It contains mainly binary files (~240 Mb), uncompressed images (~40 Mb), zip and 7z archives (~40 Mb) and some video files. I tested Tornado, Thor, QuickLZ, WinRAR's zip, WinRAR's rar and UHARC's lzp. I provide Igor Pavlov's Timer results for QuickLZ, WinRAR's zip and WinRAR's rar because I can't get their times any other way, and for Tornado, because it seems to understate its time results. Results are sorted from best to worst compression ratio.

    As you'll see, Tornado definitely has problems with I/O performance, though so do Thor, LZOP and QuickLZ. If you look at the Timer results you'll see that although the User Time is similar to what Tornado itself reports, the Global Time is substantially bigger. You might study the LZOP or QuickLZ sources for the fast I/O routines they are famous for; maybe that would help. Nevertheless, Thor is beaten! Let's salute the new Speed King!

    tor 12
    256mb hash256x, block 128mb, aricoder: compressing 410798 kb:
    256mb hash256x, block 128mb, aricoder: compressed 410798 -> 158887 kb (38.7%): 275.145 seconds
    Global Time = 290.328

    tor 11
    128mb hash128x, block 128mb, aricoder: compressing 410798 kb:
    128mb hash128x, block 128mb, aricoder: compressed 410798 -> 159042 kb (38.7%): 178.755 seconds
    Global Time = 193.750

    UHARC -mz -md32768
    COMPRESSED: 159,769,062 bytes
    TIME: 80.5 sec

    tor 10
    64mb hash64x, block 128mb, aricoder: compressing 410798 kb:
    64mb hash64x, block 128mb, aricoder: compressed 410798 -> 161380 kb (39.3%): 123.363 seconds
    Global Time = 137.703

    tor 9
    32mb hash32x, block 128mb, aricoder: compressing 410798 kb:
    32mb hash32x, block 128mb, aricoder: compressed 410798 -> 162409 kb (39.5%): 83.219 seconds
    Global Time = 97.485

    tor 8
    16mb hash16x, block 128mb, aricoder: compressing 410798 kb:
    16mb hash16x, block 128mb, aricoder: compressed 410798 -> 163194 kb (39.7%): 56.151 seconds
    Global Time = 70.562

    tor 7
    8mb hash8x, block 64mb, aricoder: compressing 410798 kb:
    8mb hash8x, block 64mb, aricoder: compressed 410798 -> 164655 kb (40.1%): 35.698 seconds
    Global Time = 48.860

    tor 6
    4mb hash4x, block 32mb, aricoder: compressing 410798 kb:
    4mb hash4x, block 32mb, aricoder: compressed 410798 -> 167455 kb (40.8%): 23.736 seconds
    Global Time = 59.125

    tor 5
    1mb hash2, block 16mb, aricoder: compressing 410798 kb:
    1mb hash2, block 16mb, aricoder: compressed 410798 -> 172627 kb (42.0%): 19.931 seconds
    Global Time = 84.656

    tor 5 -c4 -h17
    512kb hash2, block 16mb, aricoder: compressing 410798 kb:
    512kb hash2, block 16mb, aricoder: compressed 410798 -> 176440 kb (43.0%): 14.978 seconds
    Global Time = 79.562

    tor 4 -c4 -h17
    512kb hash2, block 8mb, aricoder: compressing 410798 kb:
    512kb hash2, block 8mb, aricoder: compressed 410798 -> 178565 kb (43.5%): 14.967 seconds
    Global Time = 31.016

    winrar rar m1
    COMPRESSED: 181,386,743 bytes
    TIME: 33.984 sec

    tor 3 -c4 -h17
    512kb hash2, block 4mb, aricoder: compressing 410798 kb:
    512kb hash2, block 4mb, aricoder: compressed 410798 -> 181756 kb (44.2%): 14.163 seconds
    Global Time = 29.469

    THOR ex
    COMPRESSED: 183,898,460 bytes
    TIME: 33.313 sec

    tor 2 -c4 -h17
    512kb hash2, block 2mb, aricoder: compressing 410798 kb:
    512kb hash2, block 2mb, aricoder: compressed 410798 -> 188482 kb (45.9%): 14.846 seconds
    Global Time = 37.234

    tor 6 -c4 -h17
    512kb hash4x, block 32mb, aricoder: compressing 410798 kb:
    512kb hash4x, block 32mb, aricoder: compressed 410798 -> 188951 kb (46.0%): 14.025 seconds
    Global Time = 26.500

    tor 7 -c4 -h17
    512kb hash8x, block 64mb, aricoder: compressing 410798 kb:
    512kb hash8x, block 64mb, aricoder: compressed 410798 -> 189550 kb (46.1%): 18.021 seconds
    Global Time = 31.781

    tor 4
    128kb hash2, block 8mb, aricoder: compressing 410798 kb:
    128kb hash2, block 8mb, aricoder: compressed 410798 -> 193682 kb (47.1%): 12.213 seconds
    Global Time = 29.000

    THOR exx
    COMPRESSED: 193,754,672 bytes
    TIME: 44.610 sec

    winrar zip m5
    COMPRESSED: 195,695,872 bytes
    TIME: 80.985 sec

    winrar zip m3
    COMPRESSED: 196,498,997 bytes
    TIME: 44.063 sec

    winrar zip m2
    COMPRESSED: 200,908,824 bytes
    TIME: 32.032 sec

    tor 1 -c4 -h17
    512kb hash1, block 1mb, aricoder: compressing 410798 kb:
    512kb hash1, block 1mb, aricoder: compressed 410798 -> 202042 kb (49.2%): 13.823 seconds
    Global Time = 35.454

    tor 3
    64kb hash2, block 4mb, aricoder: compressing 410798 kb:
    64kb hash2, block 4mb, aricoder: compressed 410798 -> 202200 kb (49.2%): 12.369 seconds
    Global Time = 25.782

    THOR e
    COMPRESSED: 214,805,864 bytes
    TIME: 28.125 sec

    tor 2
    64kb hash2, block 2mb, bitcoder: compressing 410798 kb:
    64kb hash2, block 2mb, bitcoder: compressed 410798 -> 216092 kb (52.6%): 7.626 seconds
    Global Time = 31.875

    QuickLZ
    COMPRESSED: 235,535,346 bytes
    TIME: 21.938 sec

    tor 1
    16kb hash1, block 1mb, bytecoder: compressing 410798 kb:
    16kb hash1, block 1mb, bytecoder: compressed 410798 -> 249646 kb (60.8%): 4.922 seconds
    Global Time = 38.531




    Besides, here is my "just for fun" test on Nero's main exe:

    tor -mem nero.exe

    Benchmarked 36003 kb, aricoder
    16kb hash1: 13686 kb (38.0%): 0.667 seconds
    32kb hash1: 12696 kb (35.3%): 0.582 seconds
    64kb hash1: 11795 kb (32.8%): 0.565 seconds

    16kb hash2: 13193 kb (36.6%): 0.715 seconds
    32kb hash2: 12846 kb (35.7%): 0.702 seconds
    64kb hash2: 11977 kb (33.3%): 0.695 seconds
    128kb hash2: 11119 kb (30.9%): 0.720 seconds
    256kb hash2: 10015 kb (27.8%): 0.702 seconds
    512kb hash2: 9297 kb (25.8%): 0.760 seconds
    1mb hash2: 8606 kb (23.9%): 0.978 seconds
    4mb hash2: 8249 kb (22.9%): 1.511 seconds
    16mb hash2: 8238 kb (22.9%): 1.507 seconds

    1mb hash4x: 9521 kb (26.4%): 0.864 seconds
    4mb hash4x: 8221 kb (22.8%): 1.169 seconds
    16mb hash4x: 8081 kb (22.4%): 1.311 seconds

    1mb hash8x: 9965 kb (27.7%): 1.142 seconds
    4mb hash8x: 8200 kb (22.8%): 1.576 seconds
    16mb hash8x: 7969 kb (22.1%): 1.793 seconds

    1mb hash16x: 10107 kb (28.1%): 1.659 seconds
    4mb hash16x: 8381 kb (23.3%): 2.320 seconds
    16mb hash16x: 7906 kb (22.0%): 2.609 seconds

    1mb hash32x: 10395 kb (28.9%): 2.383 seconds
    4mb hash32x: 8679 kb (24.1%): 3.529 seconds
    16mb hash32x: 7860 kb (21.8%): 3.871 seconds

  14. #14
    nimdamsk (Member)
    I forgot to describe my test PC. According to CPU-Z it is:

    Dualcore Pentium D 945 2x3400 MHz
    L1 cache data - 2x16 Kb
    L1 cache trace - 2x12 Kuops
    L2 cache - 2x2048 Kb

    Mainboard - Intel D945PSN

    Memory - Dualchannel 1Gb DDR2 (4-4-4-11-15) 266,7 MHz

  15. #15
    Bulat Ziganshin (Programmer)
    if anyone is interested, here are my own results:

    various text files: sources, html docs, russian SF. 239919 kb
    16kb hash1, block 1mb, bytecoder: 84283 kb (35.1%): 3.078 seconds
    64kb hash2, block 2mb, bitcoder: 62999 kb (26.3%): 6.533 seconds
    64kb hash2, block 4mb, aricoder: 56869 kb (23.7%): 9.545 seconds
    128kb hash2, block 8mb, aricoder: 55650 kb (23.2%): 12.239 seconds
    1mb hash2, block 16mb, aricoder: 53085 kb (22.1%): 14.987 seconds
    4mb hash4x, block 32mb, aricoder: 50575 kb (21.1%): 20.079 seconds
    8mb hash8x, block 64mb, aricoder: 48400 kb (20.2%): 30.344 seconds
    16mb hash16x, block 128mb, aricoder: 46497 kb (19.4%): 42.684 seconds
    32mb hash32x, block 128mb, aricoder: 45331 kb (18.9%): 65.147 seconds
    64mb hash64x, block 128mb, aricoder: 44504 kb (18.5%): 104.897 seconds
    128mb hash128x, block 128mb, aricoder: 43970 kb (18.3%): 180.505 seconds
    256mb hash256x, block 128mb, aricoder: 43634 kb (18.2%): 328.466 seconds

    QuickLZ
    1.10 3.825s 85.6mb
    1.20 5.838s 84.5mb

    THOR
    ef 5.5s 80.3mb
    e 7.3s 67.7mb
    ex 11.5s 57.8mb
    exx 67.5s 50.1mb 7.2s extraction

    LZOP
    -1 6.9s 80.8mb 1.45s extraction
    -5 5.6s 81.6mb
    -7 167.2s 60.5mb
    -9 246.7s 59.8mb 1.24s

    GZIP 1.3.5
    -1 21.5s 65.6mb 5.6s
    -5 34.7s 53.3mb
    -9 64.5s 51.7mb 4.6s

    RAR
    m1 31.9s 57.8mb
    m2 195.7s 41.3mb

    UHA
    -mz 70.1s 40.9mb

    ARC
    m1 12.0s 111.0mb
    m2 68.0s 37.5mb
    m2x 107.0s 38.9mb -md16m, 17s extraction

    CAB
    768.0s 39.1mb 3.9s extraction

    QUAD 1.12
    125.6s 40.4mb

    7z
    -mx1 104.5s 48.4mb
    -mx3 142.8s 44.6mb
    -mx5 537.4s 37.4mb 13.3s extraction


    OOo_2.0.2rc4_src.tar - OpenOffice sources. 368738 kb
    16kb hash1, block 1mb, bytecoder: 143355 kb (38.9%): 4.869 seconds
    64kb hash2, block 2mb, bitcoder: 116472 kb (31.6%): 9.664 seconds
    64kb hash2, block 4mb, aricoder: 107115 kb (29.0%): 15.340 seconds
    128kb hash2, block 8mb, aricoder: 101946 kb (27.6%): 23.058 seconds
    1mb hash2, block 16mb, aricoder: 92779 kb (25.2%): 33.408 seconds
    4mb hash4x, block 32mb, aricoder: 89427 kb (24.3%): 31.620 seconds
    8mb hash8x, block 64mb, aricoder: 87456 kb (23.7%): 48.352 seconds
    16mb hash16x, block 128mb, aricoder: 86238 kb (23.4%): 66.695 seconds
    32mb hash32x, block 128mb, aricoder: 85636 kb (23.2%): 103.856 seconds
    64mb hash64x, block 128mb, aricoder: 85401 kb (23.2%): 170.340 seconds
    128mb hash128x, block 128mb, aricoder: 85335 kb (23.1%): 296.368 seconds
    256mb hash256x, block 128mb, aricoder: 85349 kb (23.1%): 546.424 seconds

    ef 7.8s 128.4mb
    e 10.5s 116.5mb
    ex 28.7s 100.3mb
    exx 81.7s 103.1mb


    dll250 - windows dlls. 250000 kb
    16kb hash1, block 1mb, bytecoder: 157955 kb (63.2%): 4.481 seconds
    64kb hash2, block 2mb, bitcoder: 127922 kb (51.2%): 10.994 seconds
    64kb hash2, block 4mb, aricoder: 115825 kb (46.3%): 17.631 seconds
    128kb hash2, block 8mb, aricoder: 113465 kb (45.4%): 28.215 seconds
    1mb hash2, block 16mb, aricoder: 105355 kb (42.1%): 38.802 seconds
    4mb hash4x, block 32mb, aricoder: 102269 kb (40.9%): 38.171 seconds
    8mb hash8x, block 64mb, aricoder: 100215 kb (40.1%): 59.063 seconds
    16mb hash16x, block 128mb, aricoder: 98272 kb (39.3%): 80.051 seconds
    32mb hash32x, block 128mb, aricoder: 97838 kb (39.1%): 123.538 seconds
    64mb hash64x, block 128mb, aricoder: 97761 kb (39.1%): 198.844 seconds
    128mb hash128x, block 128mb, aricoder: 97309 kb (38.9%): 338.375 seconds
    256mb hash256x, block 128mb, aricoder: 97211 kb (38.9%): 618.847 seconds

    ef 7.4s 133.3mb
    e 10.8s 124.9mb
    ex 27.2s 113.9mb
    exx 88.5s 103.9mb


    linux-2.6.14.5.tar - linux kernel sources. 224102 kb
    16kb hash1, block 1mb, bytecoder: 81088 kb (36.2%): 3.015 seconds
    64kb hash2, block 2mb, bitcoder: 60336 kb (26.9%): 6.531 seconds
    64kb hash2, block 4mb, aricoder: 55078 kb (24.6%): 9.481 seconds
    128kb hash2, block 8mb, aricoder: 53797 kb (24.0%): 11.486 seconds
    1mb hash2, block 16mb, aricoder: 52180 kb (23.3%): 13.531 seconds
    4mb hash4x, block 32mb, aricoder: 49640 kb (22.2%): 18.995 seconds
    8mb hash8x, block 64mb, aricoder: 48018 kb (21.4%): 27.594 seconds
    16mb hash16x, block 128mb, aricoder: 46957 kb (21.0%): 39.720 seconds
    32mb hash32x, block 128mb, aricoder: 46302 kb (20.7%): 61.910 seconds
    64mb hash64x, block 128mb, aricoder: 45975 kb (20.5%): 102.036 seconds
    128mb hash128x, block 128mb, aricoder: 45833 kb (20.5%): 179.612 seconds
    256mb hash256x, block 128mb, aricoder: 45802 kb (20.4%): 333.537 seconds

    ef 5.1s 72.0mb
    e 6.8s 62.0mb
    ex 10.8s 52.0mb
    exx 52.6s 47.2mb


    cpu is duron 1193 (special model for hackers only )

  16. #16
    Moderator
    Quote Originally Posted by nimdamsk
    Thor is beaten! Let's salute the new Speed King!


    Is it possible to make Tornado even faster?

  17. #17
    Bulat Ziganshin (Programmer)
    no. nor do i think that qlz was beaten. the algorithms are pretty close, but tor was not hand-optimized for max speed, so the results are pretty close. in my test, qlz 1.10 was 10% faster

    i don't think that making a program especially to set records is interesting, and anyway i don't have any ideas here - both algos are lzrw1; the only thing that is probably better in my program is the multiplicative hash function (like in quad)
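
    For illustration, here is a minimal sketch of a multiplicative (QUAD-style) hash over the next 4 input bytes; the function name and constant are assumptions for the example, not taken from Tornado's or QUAD's sources:

    #include <stdint.h>
    #include <string.h>

    // hash the 4 bytes at p into a table index of 'hashbits' bits:
    // multiply by a large odd constant and keep the top bits, so all
    // four input bytes influence the final index
    static inline uint32_t hash4(const uint8_t* p, int hashbits)
    {
        uint32_t x;
        memcpy(&x, p, 4);                 // read 4 bytes (assumes little-endian layout)
        x *= 2654435761u;                 // Knuth's multiplicative constant
        return x >> (32 - hashbits);      // top 'hashbits' bits select the hash row
    }

    // usage: uint32_t row = hash4(buf + pos, 16);   // 16-bit hash -> 64k rows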

  18. #18
    encode (The Founder)
    Answering your question from compression.ru.

    How to mix arith and bit I/O or how to store bits via range encoder:

    You may just store any value in stream:

    encoder.Encode(0, 1, 2); // store bit 0
    encoder.Encode(1, 1, 2); // store bit 1

    // s = 8-bit symbol
    encoder.Encode(s, 1, 256); // store raw 8-bit value

    // s = 10-bit symbol
    encoder.Encode(s, 1, 1024); // store raw 10-bit value

    etc. etc. - i.e. it's very easy!
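
    For reference, a minimal carry-less range encoder with that Encode(start, size, total) interface might look roughly like this (a sketch in the spirit of Subbotin's coder, not the exact encoder class quoted above):

    #include <stdint.h>
    #include <stdio.h>

    static const uint32_t TOP = 1u << 24;   // renormalization thresholds
    static const uint32_t BOT = 1u << 16;

    struct RangeEncoder {
        uint32_t low = 0, range = 0xFFFFFFFFu;
        FILE* out;
        explicit RangeEncoder(FILE* f) : out(f) {}

        // encode a symbol occupying [start, start+size) out of a total of 'total'
        void Encode(uint32_t start, uint32_t size, uint32_t total) {
            low   += start * (range /= total);
            range *= size;
            // renormalize: emit top bytes while the interval is settled or too narrow
            while ((low ^ (low + range)) < TOP ||
                   (range < BOT && ((range = -low & (BOT - 1)), true)))
            {
                putc(low >> 24, out);
                low <<= 8;
                range <<= 8;
            }
        }
        void Flush() { for (int i = 0; i < 4; i++) { putc(low >> 24, out); low <<= 8; } }
    };

    // storing a raw bit then costs one Encode() call: enc.Encode(bit, 1, 2);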

  19. #19
    Bulat Ziganshin (Programmer)
    encode

    that is how i implemented it now, but it is slower than writing directly to the bitstream, which needs only a few operations (see the BitStream classes in my program). right now aricoder is 2-3x slower than bitcoder. by writing bit fields directly to the bitstream i can get something in between these results, so the resulting aricoder will be only 1.5-2x slower than bitcoder while maintaining exactly the same efficiency
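
    For comparison, direct bit-field output needs only a shift, an OR, and an occasional byte store per field - roughly like this sketch (hypothetical names, not the actual BitStream classes from Tornado):

    #include <stdint.h>
    #include <stdio.h>

    struct BitWriter {
        uint64_t bitbuf = 0;    // pending bits, least-significant-bit first
        int      bitcnt = 0;    // number of pending bits
        FILE*    out;
        explicit BitWriter(FILE* f) : out(f) {}

        // append the low 'nbits' bits of 'value' (nbits <= 32)
        void putbits(uint32_t value, int nbits) {
            bitbuf |= (uint64_t)value << bitcnt;
            bitcnt += nbits;
            while (bitcnt >= 8) {             // flush completed bytes
                putc((int)(bitbuf & 0xFF), out);
                bitbuf >>= 8;
                bitcnt -= 8;
            }
        }
        void flush() { if (bitcnt) putc((int)(bitbuf & 0xFF), out); bitbuf = 0; bitcnt = 0; }
    };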

    Quote Originally Posted by LovePimple
    What about QuickLZ v1.20?
    it is 1.5x slower on my test!

    Quote Originally Posted by Bulat Ziganshin
    QuickLZ
    1.10 3.825s 85.6mb
    1.20 5.838s 84.5mb
    i don't understand why, but the results are the same across various files. its -mem mode shows a 5% improvement, though. one possible explanation is that it was optimized for quicker i/o (and with real i/o the new version really does work faster) and this hurt its memory-only performance. my usual test is compressing a cached file to nul, so any disk i/o isn't taken into account

    i think that any real tests should be made with real i/o, and tor definitely loses in this area. obviously it's too early to take care of that

  20. #20
    nimdamsk (Member)
    as far as i understood, thor was written in delphi with inline assembler. its last versions were tuned for fast i/o, because the bottleneck was there. afaik linux has better file and memory i/o managers than windows, so it would be interesting to see tornado on linux.

  21. #21
    Moderator
    Quick comparison (-mem) of QuickLZ versions 1.10 and 1.20...

    Test Machine: AMD Sempron 2400+

    qlz120 -mem fp.log
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking, please wait...

    Compressed at 193.2 Mbyte/s.

    Decompressed at 270.5 Mbyte/s.

    (1 MB = 1000000 byte)


    qlz110 -mem fp.log
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking...

    Compressed at 188.5 Mbyte/s.

    Decompressed at 271.5 Mbyte/s.



    qlz120 -mem rafale.bmp
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking, please wait...

    Compressed at 89.3 Mbyte/s.

    Decompressed at 169.1 Mbyte/s.

    (1 MB = 1000000 byte)


    qlz110 -mem rafale.bmp
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking...

    Compressed at 89.8 Mbyte/s.

    Decompressed at 166.0 Mbyte/s.



    qlz120 -mem vcfiu.hlp
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking, please wait...

    Compressed at 123.6 Mbyte/s.

    Decompressed at 206.9 Mbyte/s.

    (1 MB = 1000000 byte)


    qlz110 -mem vcfiu.hlp
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking...

    Compressed at 119.5 Mbyte/s.

    Decompressed at 206.9 Mbyte/s.



    qlz120 -mem scribble.wav
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking, please wait...

    Compressed at 238.6 Mbyte/s.

    Decompressed at 312.6 Mbyte/s.

    (1 MB = 1000000 byte)


    qlz110 -mem scribble.wav
    Reading source file...
    Setting HIGH_PRIORITY_CLASS...
    Benchmarking...

    Compressed at 231.2 Mbyte/s.

    Decompressed at 318.9 Mbyte/s.

  22. #22
    Bulat Ziganshin (Programmer)
    a bit of info about tornado internals:


    > even as such, one needs rescaling, to locate the correct value, to adjust
    > the window, and to renormalize.
    >
    > how one could (even potentially) defeat the decompression times of static
    > huffman, is difficult to imagine (usually, IME, the *only* thing faster is
    > raw bytes).

    my own tests:

    Tornado
    bytecoder: decompression - 1.8-2.7 sec
    bitcoder: decompression - 2.7-3.5 sec
    hufcoder: decompression - 4.1-4.7 sec
    aricoder: decompression - 6.5-6.8 sec

    lzh:
    CAB 3.9s
    GZIP 4.6s
    RAR 7.2s

    lzari:
    7z 13.3s

    why is it so fast? first, a direct comparison is unfair. tornado uses CABARC-like clustering of distance and length into one slot, so it does only one arith/huf decode per match instead of two. second, gzip uses much smaller distances, so all copied data stays in cache. third, cabarc has somewhat better compression due to its optimal match finder, and this improves its decompression speed. also, all the programs except tornado and gzip implement repdist/repboth, which also takes a bit of time to maintain. lastly, all the lzh programs use block-static huffman coding while tornado uses a semi-adaptive one
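
    A rough illustration of the slot idea (an Elias-gamma-like split chosen for the example - not Tornado's or CABARC's actual slot layout): the entropy coder sees one small symbol carrying the length slot and the distance slot together, and the exact values are completed with raw extra bits from the bitstream.

    #include <cstdint>
    #include <cstdio>
    #include <cassert>

    // split a value v >= 1 into a "slot" (its bit length) that gets entropy coded
    // and raw extra bits that go to the bitstream verbatim; a match then needs a
    // single entropy decode for the combined (length slot, distance slot) symbol
    static void to_slot(uint32_t v, uint32_t& slot, uint32_t& extra)
    {
        assert(v >= 1);
        slot  = 31 - __builtin_clz(v);     // index of the highest set bit
        extra = v - (1u << slot);          // 'slot' raw extra bits
    }

    static uint32_t from_slot(uint32_t slot, uint32_t extra)
    {
        return (1u << slot) + extra;
    }

    int main()
    {
        for (uint32_t dist = 1; dist <= 1000000; dist++) {
            uint32_t slot, extra;
            to_slot(dist, slot, extra);
            assert(from_slot(slot, extra) == dist);    // every distance round-trips
        }
        puts("ok");
    }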

    finally, we can see that tornado's hufcoder has speed close to cabarc's lzh algorithm and that it is about 2x faster than tornado's aricoder (i subtracted 2 seconds from both timings for non-entropy-coder activity). that is the pure speed of a semi-adaptive hufcoder vs a semi-adaptive aricoder. because it's *semi-adaptive* i don't call rescaling too often. searching for the decoded value is very simple - i use a 2048-entry table to find the first possible decoded value and then run a linear scan. it's rather like decoding huffman values via a main lookup table plus additional lookup tables, although i use a linear scan instead of a second lookup
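
    A sketch of that decode-side search, assuming counters scaled so the total is a power of two; the table size, names and layout here are assumptions, not Tornado's actual code:

    #include <cstdint>
    #include <cassert>
    #include <vector>

    static const int TOTBITS = 14;                 // sum of all frequencies = 2^14
    static const int TBLBITS = 11;                 // 2048-entry jump-in table

    struct DecodeTables {
        std::vector<uint32_t> cumfreq;             // cumfreq[i] = freq[0] + ... + freq[i-1]
        uint16_t first_sym[1 << TBLBITS];          // first symbol that can own this table cell

        void build(const std::vector<uint32_t>& freq) {
            size_t n = freq.size();
            cumfreq.assign(n + 1, 0);
            for (size_t i = 0; i < n; i++) cumfreq[i + 1] = cumfreq[i] + freq[i];
            assert(cumfreq[n] == (1u << TOTBITS)); // frequencies must be scaled to the power-of-two total
            size_t s = 0;
            for (uint32_t cell = 0; cell < (1u << TBLBITS); cell++) {
                uint32_t target = cell << (TOTBITS - TBLBITS);   // smallest target value in this cell
                while (cumfreq[s + 1] <= target) s++;
                first_sym[cell] = (uint16_t)s;
            }
        }

        // 'target' is the decoder's current code value scaled into [0, 2^TOTBITS)
        uint32_t decode_symbol(uint32_t target) const {
            uint32_t s = first_sym[target >> (TOTBITS - TBLBITS)];
            while (cumfreq[s + 1] <= target) s++;  // short linear scan from the table hint
            return s;
        }
    };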

    i also select blocks so that total=2^n, which means i execute one less division in both the encoder and the decoder (division by 2^n is just a shift)

    by "semi-adaptive" i mean that data are encoded with one set of counters and counted to other set at the same time. when time goes (each 24k symbols), i calculate new encoding tables according to stats of last data and use them for the following data. my experiments showed that these decrease compresion ratio by 1-2% compared to static approach, although i'm not 100% sure that i calculated this correctly. but what is much more important for me is that implementation of semi-adaptive approach is much smaller and much more readable than for block-static zip's approach. the whole implementation of aricoder or hufcoder is ~200 lines long, 10x smaller than zip's trees.cpp

    i also implemented a much simpler and probably much faster variant of the huffman_tree_build procedure compared to zip. instead of using a priority queue, i just maintain two lists - original symbols and combined nodes. both lists are sorted by frequency - the first by calling qsort() at the start, and the second simply because each newly constructed node has a larger counter than any previously constructed node. so, finding the two nodes with the smallest counters is trivial... i found this idea 12 years ago, but didn't have a good opportunity to implement it before
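
    The same two-list idea as a self-contained sketch (node layout and names are mine, not Tornado's): because every newly combined node weighs at least as much as any earlier combined node, a plain array appended in construction order stays sorted, and the two cheapest nodes are always at the fronts of the two lists.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Node { uint64_t weight; int left, right; };   // left/right = node indices, -1 for leaves

    std::vector<Node> build_huffman_tree(const std::vector<uint64_t>& freqs)
    {
        int n = (int)freqs.size();
        std::vector<Node> nodes;
        nodes.reserve(2 * n);
        std::vector<int> leaves(n);
        for (int i = 0; i < n; i++) { nodes.push_back({freqs[i], -1, -1}); leaves[i] = i; }

        // list 1: leaves sorted ascending by weight (the one qsort()-style step)
        std::sort(leaves.begin(), leaves.end(),
                  [&](int a, int b) { return nodes[a].weight < nodes[b].weight; });

        std::vector<int> combined;                       // list 2: grows in weight order by construction
        size_t li = 0, ci = 0;

        auto pop_min = [&]() -> int {                    // take the lighter front of the two lists
            if (ci < combined.size() &&
                (li >= leaves.size() || nodes[combined[ci]].weight <= nodes[leaves[li]].weight))
                return combined[ci++];
            return leaves[li++];
        };

        for (int step = 0; step < n - 1; step++) {       // n-1 merges build the whole tree
            int a = pop_min(), b = pop_min();
            nodes.push_back({nodes[a].weight + nodes[b].weight, a, b});
            combined.push_back((int)nodes.size() - 1);
        }
        return nodes;                                    // nodes.back() is the root
    }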

    i'm still polishing tornado 0.2 but hope to release it (with sources) in a week or two. it's very close in speed/compression to thor 0.95 and significantly outperforms zip, while using more memory

  23. #23
    nimdamsk (Member)
    Would you try to sell your fast compression technology to MS? Their compression in NTFS is awful

  24. #24
    Moderator
    Tornado (-c2) easily compresses Werner's test.txt file to 18 bytes.

