Results 1 to 30 of 110

Thread: The return of THOR!

  1. #1
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    From the MC guestbook...

    Quote Originally Posted by Oscar Garcia
    Greetings! You can download and test THOR v0.94 here:

    http://rapidshare.com/files/27227218/THOR_094.zip.html

    Happy testing!

    "For those about to rock..."
    Mirror #1: THOR_094.zip (68 KB)

    Mirror #2: Download THOR

  2. #2
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    2x faster with the same compression level as tornado

  3. #3
    Member
    Join Date
    Apr 2007
    Posts
    12
    Thanks
    0
    Thanked 0 Times in 0 Posts
    The compression level isn't top notch, but it's fast as hell!

  4. #4
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Bulat, I think you should request the source of THOR! Since the author appears...

  5. #5
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Yepp, THOR was written using Delphi 6.0 - 7.0. What if I'll rewrite LZPM using Delphi??

  6. #6
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by encode
    What if I'll rewrite LZPM using Delphi??
    Reminds me of the times when I had even less experience than now and thought that console applications were written in Pascal and GUI ones in C++

  7. #7
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    encode
    the main source of slowness nowadays is the number of cache misses. i know every cache miss in my algorithm, and it makes no difference whether it is written in delphi, asm or ? (nevertheless, i can say that delphi has much worse optimization than any modern c++ compiler)

    lzp compressors, even those written 10 years ago, are good performers. probably, with new ideas added, the LZP method allows much better speed/compression than LZ77. and i flatter myself with the hope that the new thor was improved using tor or quad ideas

    i don't see much room to improve tor in fast modes (-1 to -4), so i should say that THOR is definitely the best. my program has its advantages, but the main goal - to outperform thor using my huge lz77 experience - was not reached. it would be interesting to try LZP too, but i think it would require much more time (i'm pretty ignorant in this area) and is not so important for freearc users - even at half the speed of thor, the tor algorithm will be enough for fast freearc modes

  8. #8
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Yes, but it is worth a try! I'll probably get better I/O performance.

  9. #9
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    about i/o - there are a few standard ideas that you may use

    1) don't use fgetc/fputc. make your own buffer instead:

    inline void put_byte(int c)
    {
        *curptr++ = c;
        if (curptr >= bufend) {
            /* buffer is full: write it out and reset curptr */
        }
    }

    2) if you really need fast i/o - use 3 threads: for reading, compressing and writing. read/write data in large chunks. use circular buffers. preread a large enough amount of data (for example, for quad, which compresses data in 16mb blocks, you should use 4*16mb preread buffers and at least 2*16mb output buffers)
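Point 1 of the post above can be fleshed out roughly as follows. This is only a minimal single-threaded sketch, not code from tornado or FreeArc: the names `OutBuf`, `out_init`, `out_flush`, `out_put` and `out_demo`, and the 64 KB buffer size, are assumptions made for illustration. The thread-based pipeline from point 2 is not covered here.

```c
#include <assert.h>
#include <stdio.h>

enum { OUT_BUF_SIZE = 1 << 16 };  /* 64 KB, an arbitrary illustrative size */

typedef struct {
    FILE *fp;                      /* destination stream */
    unsigned char buf[OUT_BUF_SIZE];
    unsigned char *curptr;         /* next free byte in buf */
    unsigned char *bufend;         /* one past the last byte of buf */
} OutBuf;

static void out_init(OutBuf *o, FILE *fp)
{
    assert(o != NULL && fp != NULL);
    o->fp = fp;
    o->curptr = o->buf;
    o->bufend = o->buf + OUT_BUF_SIZE;
}

/* Write out whatever is buffered with a single fwrite; 0 on success. */
static int out_flush(OutBuf *o)
{
    size_t n = (size_t)(o->curptr - o->buf);
    o->curptr = o->buf;
    return fwrite(o->buf, 1, n, o->fp) == n ? 0 : -1;
}

/* Store one byte; touch the stream only when the buffer fills up. */
static inline int out_put(OutBuf *o, int c)
{
    *o->curptr++ = (unsigned char)c;
    return o->curptr >= o->bufend ? out_flush(o) : 0;
}

/* Demo (hypothetical helper): push `count` bytes through the buffer into
   a temporary file and return the resulting file size, or -1 on error. */
long out_demo(long count)
{
    static OutBuf o;               /* static: 64 KB is large for the stack */
    FILE *fp = tmpfile();
    long size = -1;
    if (fp == NULL)
        return -1;
    out_init(&o, fp);
    for (long i = 0; i < count; i++) {
        if (out_put(&o, (int)(i & 0xFF)) != 0) {
            fclose(fp);
            return -1;
        }
    }
    /* The final partial buffer must be flushed explicitly. */
    if (out_flush(&o) == 0 && fflush(fp) == 0 && fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

The hot path in `out_put` is one store and one comparison. The stream is only touched once per full buffer, which is the point of avoiding a library call per byte. Forgetting the last `out_flush` is the usual mistake with this pattern.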

  10. #10
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    It's a well-known thing...

    I must admit that Pascal/Delphi has some sort of magic. It's easy. It has all functions for easy app development. Easy to read, easy to understand...

  11. #11
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,475
    Thanks
    26
    Thanked 121 Times in 95 Posts
    have you tried using win32's CreateFile api in asynchronous mode, with the sequential scan flag set and the system's internal file buffering disabled?

    i think it should be faster, because asynchronous mode allows you to perform i/o operations and (de)compress files at the same time.

  12. #12
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    it's the same as using separate threads for i/o. in haskell, my way is simpler (and i tested that it really allows 100% overlapping of i/o and compression), in C your way may be simpler

    the only difference is that i don't disable buffering. my way is already implemented in freearc; you can try it using a fast mode (say, -m2) and a large number of files

  13. #13
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    Quote Originally Posted by encode
    I must admit that Pascal/Delphi has some sort of magic. It's easy. It has all functions for easy app development. Easy to read, easy to understand...
    yes, it's great for application program development (see juliet-prg.narod.ru), but it doesn't give any automatic speed improvements. our algorithms are the problematic part, not our compilers/languages

  14. #14
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Thanks Oscar!

    I invite you to join us here at the forum.

  15. #15
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LovePimple
    I invite you to join us here at the forum.
    Thanks LovePimple!

    Quote Originally Posted by encode
    I must admit that Pascal/Delphi has some sort of magic
    You're right, encode. You've got quite a nice site here.

    Quote Originally Posted by Bulat Ziganshin
    i don't see much room to improve tor in fast modes (-1 to -4), so i should say that THOR is definitely the best. my program has its advantages, but the main goal - to outperform thor using my huge lz77 experience - was not reached. it would be interesting to try LZP too, but i think it would require much more time (i'm pretty ignorant in this area) and is not so important for freearc users - even at half the speed of thor, the tor algorithm will be enough for fast freearc modes
    Your words honour you, Bulat. But don't give up, we must stress those new Opterons to the limit

  16. #16
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Good to have you on board Oscar! Excellent work with the latest version of THOR!




    Could you please explain what changes you have made since the previous version?

  17. #17
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    Oscar, i'm glad to see you here

    about tornado - i am working on the next version, but it has only minor improvements in fast modes. unfortunately, a good program needs a large amount of time, even if its author has good potential. i developed tor mainly for freearc and 7zip, and Igor said that he doesn't want to use it - so it is only for my own archiver

    i have already developed and adapted several small algorithms to improve text, wave and executable compression in freearc (www.haskell.org/bz); tornado becomes one more algorithm - for fast mode. i can spend 2-4 weeks on it and then prefer to go on to polishing other aspects of the archiver, while for you thor is the only (compression) algorithm

    if some fast lzh compressor becomes available in source form, i will prefer to use it rather than reinvent the wheel. if you make the thor sources available under the lgpl license, i will probably reimplement them in C++ and include them in my program. otherwise, thor will remain the best program in this area, while tor will remain the only one available with sources

  18. #18
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    I notice that you still haven't fixed the bug that was reported several months ago by Mark Cramer.

    Quote Originally Posted by Mark Cramer
    For Oscar Garcia,

    Thor 0.93a seems to have minor trouble with files over 2G, the file is 2,228,682,752 bytes

    -=[ THOR ]=- v0.93 alpha Oscar Garcia
    FILES: 1
    SIZE: -2066284544 bytes
    COMPRESSED: 1049189720 bytes
    RATIO: UNDEFINED
    TIME: 431sec 250msec
    SPEED: UNDEFINED

    Just FYI

    Mark
    My test...


    -=[ THOR ]=- v0.94 alpha Oscar Garcia


    WARNING! Evaluation version. May contain bugs. Use only for TESTING.

    - Compressing... 100%%
    - Done.

    FILES: 1
    SIZE: -1884254208 bytes
    COMPRESSED: 1366538540 bytes
    RATIO: UNDEFINED
    TIME: 122sec 734msec
    SPEED: UNDEFINED

    Kernel Time = 15.031 = 00:00:15.031 = 12%
    User Time = 38.843 = 00:00:38.843 = 31%
    Process Time = 53.875 = 00:00:53.875 = 43%
    Global Time = 122.828 = 00:02:02.828 = 100%


    The size of the test file was only 2.24 GB (2,410,713,088 bytes).

    Any chance you can fix this bug for the next version?

  19. #19
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    LovePimple
    read:
    Quote Originally Posted by LovePimple
    May contain bugs

  20. #20
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Does contain a bug!

  21. #21
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    It's not a bug. It's an integer limitation...

  22. #22
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by encode
    It's an integer limitation
    Well, Delphi has the LongWord type, which is an unsigned integer, so the limit could be 4 GB then... And then there is Int64...

  23. #23
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Generic integer types for 32-bit implementations of Delphi

    Type      Range                     Format
    Integer   -2147483648..2147483647   signed 32-bit
    Cardinal  0..4294967295             unsigned 32-bit

    Fundamental integer types include Shortint, Smallint, Longint, Int64, Byte, Word, and Longword.

    Fundamental integer types

    Type      Range                     Format
    Shortint  -128..127                 signed 8-bit
    Smallint  -32768..32767             signed 16-bit
    Longint   -2147483648..2147483647   signed 32-bit
    Int64     -2^63..2^63-1             signed 64-bit
    Byte      0..255                    unsigned 8-bit
    Word      0..65535                  unsigned 16-bit
    Longword  0..4294967295             unsigned 32-bit

    Most likely he uses Integer.

    Or this is just a writeln bug!
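The negative SIZE values reported earlier in the thread are consistent with a byte count held in a signed 32-bit variable. The sketch below reproduces the arithmetic in C rather than Delphi. It assumes the counter silently wraps modulo 2^32 instead of raising an overflow error. The function names `wrapped_size32` and `show_sizes` are made up for this illustration and are not taken from THOR.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* What a signed 32-bit counter holds after counting `size` bytes:
   only the low 32 bits survive, read back as two's complement. */
int32_t wrapped_size32(uint64_t size)
{
    uint32_t low = (uint32_t)size;              /* reduce modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;                    /* still fits, unchanged */
    /* Above 2^31 - 1 the value reads as low - 2^32, a negative number. */
    return (int32_t)((int64_t)low - INT64_C(4294967296));
}

/* Print the true size next to the wrapped value. */
void show_sizes(uint64_t size)
{
    printf("true: %" PRIu64 " bytes, signed 32-bit: %" PRId32 " bytes\n",
           size, wrapped_size32(size));
}
```

For the two files in the thread, 2,228,682,752 - 4,294,967,296 = -2,066,284,544 and 2,410,713,088 - 4,294,967,296 = -1,884,254,208, which are exactly the SIZE values printed by v0.93 and v0.94. A 64-bit counter (Int64 in Delphi, `uint64_t` here) avoids the wrap entirely, while an unsigned 32-bit one only moves the limit to 4 GB.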

  24. #24
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Yup, I use cardinal and today I discovered longword... learn something every day

  25. #25
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    thor benchmarked (also tornado, quicklz, lzpm, m99).
    http://cs.fit.edu/~mmahoney/compression/text.html

  26. #26
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,511
    Thanks
    746
    Thanked 668 Times in 361 Posts
    Quote Originally Posted by Matt Mahoney
    thor benchmarked (also tornado, quicklz, lzpm, m99).
    http://cs.fit.edu/~mmahoney/compression/text.html
    actually, quicklz-0 should be the fastest compressor in your test. tornado has predefined options -1 to -12, and thor 0.94 works better than the previous version, so it makes sense to retest ef/e/ex too

  27. #27
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi all,

    Quote Originally Posted by LovePimple
    Any chance you can fix this bug for the next version?
    This one should handle huge files (>2GB):
    http://rapidshare.com/files/27449609/THOR_094.zip.html

    Quote Originally Posted by encode
    Most likely he uses Integer
    It was longint

  28. #28
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Matt Mahoney
    thor benchmarked (also tornado, quicklz, lzpm, m99).
    Thanks, Matt

    Quote Originally Posted by Oscar
    This one should handle huge files (>2GB):
    Thank you!

  29. #29
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Oscar
    This one should handle huge files (>2GB):
    http://rapidshare.com/files/27449609/THOR_094.zip.html
    Thanks Oscar! Works like a charm!

    Mirror: Download THOR

  30. #30
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    By the way, I have my own fast LZ! It compresses enwik8 within 2 sec! (looks like it is I/O bound). If anyone is interested, I can release it.

    It uses:
    + My own LZP
    + Byte aligned I/O
    + Has Optimal Parsing

