
Thread: TC 5.0dev8 released!

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    It's not really what I meant... But in this version I improved the LZP layer, which results in higher compression, especially on text files.

    Enjoy!

    Link:
    Download TC 5.0dev8 (30 KB)

  2. #2
    The Founder encode's Avatar
    TC 5.0dev8 on Large Text Compression Benchmark:

    ENWIK8: 27,801,253 bytes
    ENWIK9: 246,923,158 bytes (c 376 sec, d 415 sec)

    Memory usage: 24 MB

    P4 3.0 GHz, 1 GB RAM, Windows XP SP2


  3. #3
    The Founder encode's Avatar
    Well, the speed is affected. I think it's due to cache misses - do you remember my low-memory LZP index table implementation? The number of references to it has now doubled.
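
    For readers unfamiliar with LZP, here is a minimal sketch of what an LZP index table does. It is not TC's code: the order-3 context, the 16-bit hash, the multiplier and the token list are all assumptions chosen to keep the example short. Each position costs one lookup in a table indexed by a hash, which is the kind of scattered memory access that tends to cause cache misses.

```python
# Minimal LZP sketch (illustrative only, NOT TC's actual implementation).
# Assumptions: order-3 context, 16-bit hash, tokens kept in a Python list.
MIN_CTX = 3
HASH_BITS = 16


def ctx_hash(buf, i):
    # Hash the three bytes preceding position i into a table index.
    h = (buf[i - 3] << 16) | (buf[i - 2] << 8) | buf[i - 1]
    return ((h * 2654435761) & 0xFFFFFFFF) >> (32 - HASH_BITS)


def lzp_encode(data):
    table = [-1] * (1 << HASH_BITS)  # context hash -> last position seen
    tokens, i, n = [], 0, len(data)
    while i < n:
        length = 0
        if i >= MIN_CTX:
            slot = ctx_hash(data, i)
            pred = table[slot]       # the index table lookup
            table[slot] = i
            if pred >= 0:
                # Count how many bytes match the predicted position.
                while i + length < n and data[pred + length] == data[i + length]:
                    length += 1
        if length > 0:
            tokens.append(("match", length))
            i += length
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lzp_decode(tokens):
    table = [-1] * (1 << HASH_BITS)
    out = bytearray()
    for kind, value in tokens:
        i = len(out)
        pred = -1
        if i >= MIN_CTX:
            # Mirror the encoder's table update at every token start.
            slot = ctx_hash(out, i)
            pred = table[slot]
            table[slot] = i
        if kind == "match":
            for k in range(value):   # byte by byte, so overlaps work
                out.append(out[pred + k])
        else:
            out.append(value)
    return bytes(out)
```

    Note that no match offset is stored: the decoder rebuilds the same table and derives the predicted position itself.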


  4. #4
    The Founder encode's Avatar
    TC 5.0dev8 on Calgary Corpus:

    bib: 28,647 bytes
    book1: 247,931 bytes
    book2: 167,099 bytes
    geo: 61,858 bytes
    news: 117,870 bytes
    obj1: 10,510 bytes
    obj2: 76,037 bytes
    paper1: 17,068 bytes
    paper2: 26,336 bytes
    pic: 52,800 bytes
    progc: 12,786 bytes
    progl: 15,117 bytes
    progp: 10,626 bytes
    trans: 16,623 bytes

    total: 861,308 bytes (2.1932 bpb)


  5. #5
    The Founder encode's Avatar
    TC 5.0dev8 on SFC:

    A10.jpg: 853,308 bytes
    acrord32.exe: 1,703,280 bytes
    english.dic: 928,452 bytes
    FlashMX.pdf: 3,761,712 bytes
    fp.log: 628,256 bytes
    mso97.dll: 2,068,751 bytes
    ohs.doc: 854,993 bytes
    rafale.bmp: 1,033,156 bytes
    vcfiu.hlp: 709,955 bytes
    world95.txt: 586,745 bytes [!]

    total: 13,128,608 bytes


  6. #6
    The Founder encode's Avatar
    Okay, a few notes about the test results. Generally, compression is improved, but on small files it is about the same, and on some files like 'english.dic' compression is ruined. But now TC compresses 'world95.txt' to 572 KB, which is good. Also note that on ALL my test files compression is improved, sometimes by a little, sometimes by a lot. The speed is affected, but not too much - it depends entirely on the files and the PC, since we need fast access to memory.


  7. #7
    The Founder encode's Avatar
    By the way, TC stands for Turbo Compressor.


  8. #8
    Guest
    Just downloaded it. Thanks!

  9. #9
    The Founder encode's Avatar
    Performance on 'bible.txt', 4,047,392 bytes

    TC 5.0dev8: 892,651 bytes
    LZPXJ 1.2a, -m3: 959,008 bytes
    UHARC 0.6b, -mz: 1,002,070 bytes
    LZPX 1.5b: 1,117,790 bytes
    PKZIP 2.50, -exx: 1,172,728 bytes
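
    To compare these results on a common scale, the sizes can be converted to a percentage of the original and to bits per byte (taking "bpb" in its usual sense of compressed bits per input byte, which is an assumption about the thread's usage). A small sketch using only the numbers listed above:

```python
# Sizes copied from the 'bible.txt' results in this post.
ORIGINAL_SIZE = 4_047_392

results = {
    "TC 5.0dev8": 892_651,
    "LZPXJ 1.2a, -m3": 959_008,
    "UHARC 0.6b, -mz": 1_002_070,
    "LZPX 1.5b": 1_117_790,
    "PKZIP 2.50, -exx": 1_172_728,
}


def bits_per_byte(compressed, original):
    # Compressed bits spent per byte of original input.
    return 8 * compressed / original


def report(sizes, original):
    # One formatted line per program, smallest output first.
    lines = []
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1]):
        percent = 100 * size / original
        bpb = bits_per_byte(size, original)
        lines.append(f"{name:18s} {size:>10,d}  {percent:6.2f}%  {bpb:.4f} bpb")
    return lines


if __name__ == "__main__":
    print("\n".join(report(results, ORIGINAL_SIZE)))
```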


  10. #10
    Guest
    Great results!

    Be interesting to see how this version performs on the MFC test.

  11. #11
    Guest
    TC 5.0dev8 fails to correctly decompress the MFC test set (output size is three times the original input size).

  12. #12
    The Founder encode's Avatar
    I guess I know what the problem is. For the first time, I compiled this version with different compiler options, enabling extra optimizations such as loop unrolling and many more. Due to some incompatibility with the hardware/software, this can cause such an issue. Why I think so:
    + I test each version I release on gigabytes of data. On my PC no issues were found.
    + Version 5.0dev8 differs from 5.0dev7 only insignificantly - only the LZP index table was changed. And this table cannot cause such a bug.
    + And finally, the output size CANNOT be larger than the original, uncompressed data. This is inherent to the algorithm - even if the algorithm has serious bugs or the compressed data is corrupted, the output size must equal the original size. But I think loop unrolling or something like it can, in some cases and with different software/hardware, cause this issue.

    So this evening I will recompile this version with the standard options, as always. Anyway, you'll be informed.


  13. #13
    The Founder encode's Avatar
    Okay, I recompiled TC 5.0dev8 using the standard options. Just redownload the file! The CRC32 of the new EXE must be: A04461D6

    Download TC 5.0dev8 (Recompiled) (30 KB)
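
    One way to check a downloaded file against the CRC32 above is sketched below. It assumes the posted value is the standard CRC-32 used by ZIP and zlib, which the post does not state explicitly. The function names are made up for this example.

```python
import zlib


def stream_crc32(stream, chunk_size=1 << 16):
    # Feed the stream to zlib.crc32 in chunks to keep memory use constant.
    crc = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def file_matches_crc32(path, expected_hex):
    # Compare a file's CRC-32 with a hex string such as "A04461D6".
    with open(path, "rb") as f:
        return stream_crc32(f) == int(expected_hex, 16)
```

    Usage would look like file_matches_crc32("tc.exe", "A04461D6"), where "tc.exe" is a placeholder for wherever the download was saved.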

  14. #14
    Guest
    Thanks!

  15. #15
    The Founder encode's Avatar
    By the way, read an article on TC (in Russian):
    http://forum.compression.ru/viewtopic.php?p=3224#3224

    In this article I explain the TC algorithm, basic principles and implementation details.


  16. #16
    The Founder encode's Avatar

  17. #17
    Guest
    That photo looks a bit like me!

  18. #18
    The Founder encode's Avatar
    Looks like the recompiled TC is clean! Well, you see how important the compiler options are... The source code was completely untouched, only the compiler settings differ - that's why the version number is unchanged.


  19. #19
    Guest
    It shows no problems on my PC with a 911 MB tar file.

  20. #20
    The Founder encode's Avatar
    Future plans:
    + Add an EXE filter. This filter gives a serious compression improvement on EXE/DLL files. First of all, it can noticeably improve compression on the SFC test kit, since the improvement can be about 200 KB on each of the two executable files in this benchmark. But that is not the goal. I think a small filter doesn't hurt, but I will not add lots of filters and transform stages, as with PIMPLE, since after that the compressor becomes a bundle-monster.
    + Add CRC32 checking. This is a must-have feature. In addition, it will help to detect any engine issues and ensure correct decompression.
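
    The post does not say how the planned EXE filter would work. A common technique for x86 executables is the E8 transform, which rewrites the relative operand of CALL instructions as an absolute offset so that repeated calls to the same target become identical byte strings. The sketch below shows that technique under the assumption that every 0xE8 byte starts a CALL. Real filters usually add range checks to avoid damaging non-code data.

```python
import struct


def e8_filter(data, encode=True):
    """Convert CALL (0xE8) rel32 operands to absolute offsets, or back."""
    buf = bytearray(data)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:
            (operand,) = struct.unpack_from("<I", buf, i + 1)
            next_instr = i + 5  # rel32 is relative to the next instruction
            if encode:
                operand = (operand + next_instr) & 0xFFFFFFFF
            else:
                operand = (operand - next_instr) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, operand)
            i += 5  # skip the operand so both directions scan identically
        else:
            i += 1
    return bytes(buf)
```

    Because the opcode bytes are never modified and the operand is always skipped, the decoder visits exactly the same positions as the encoder, which makes the transform reversible.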

    Well, it looks like the work on the base engine is over. I will continue experimenting and will listen to other data compression programmers for improvement suggestions. But at the moment, the encoder's configuration is the best in all respects - compression, speed and memory usage.

    My kung-fu is the best!

  21. #21
    The Founder encode's Avatar
    I have a few ideas for improving the base engine, though, like coding quantized match lengths.
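
    The post does not spell out what scheme is meant by quantized match lengths. One possible reading, shown here purely as an assumption, is to split each length into a small logarithmic bucket number (which an entropy coder can model well) plus a few raw extra bits:

```python
def quantize_length(length):
    """Split a match length >= 1 into (bucket, extra_bits, extra_value)."""
    if length < 1:
        raise ValueError("match length must be at least 1")
    bucket = length.bit_length() - 1      # floor(log2(length))
    extra_bits = bucket                   # raw bits needed inside the bucket
    extra_value = length - (1 << bucket)  # offset from the bucket's start
    return bucket, extra_bits, extra_value


def dequantize_length(bucket, extra_value):
    # Inverse of quantize_length.
    return (1 << bucket) + extra_value
```

    With this layout, lengths 1 to 1023 fall into only ten buckets, so the modeled alphabet stays small even when matches are long.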

  22. #22
    Guest
    What algorithm does QazaR 0.0 pre5 use?
    world95.txt: 586,745 tc
    world95.txt: 567,108 qazar

  23. #23
    The Founder encode's Avatar
    What algorithm does QazaR 0.0 pre5 use?

    QAZAR uses a modified LZRW4 algorithm. The author calls it just LZP, but that is not accurate.

    Also note that with these options (-x7 -l7) memory usage grows linearly and can reach really large numbers. At the same time, TC always uses 24 MB.

    And finally, in fact, QAZAR really is slower than TC!


  24. #24
    Guest
    I'm well aware that TC is superior to QAZAR and many others.

  25. #25
    Guest
    Werner has confirmed that TCdev8 is now decompressing without problems.

    TC Rocks!

  26. #26
    The Founder encode's Avatar
    It's interesting, but most new authors learned from LZPX/LZPXJ, since they are open source. And QAZAR is no exception to the rule - do you remember the first versions? Werner asked about QAZAR back then - is it an LZPX clone or something? So, do not forget your roots!

  27. #27
    The Founder encode's Avatar
    Performance on 'calgary.tar', 3,152,896 bytes

    TC 5.0dev8: 868,969 bytes
    LZPXJ 1.2a, -m3: 890,503 bytes
    UHARC 0.6b, -mz: 903,649 bytes
    QAZAR 0.0pre5: 911,599 bytes
    LZPX 1.5b: 982,999 bytes
    PKZIP 2.50, -exx: 1,017,863 bytes


  28. #28
    The Founder encode's Avatar
    Performance on 'fp.log', 20,617,071 bytes

    TC 5.0dev8: 628,256 bytes
    QAZAR 0.0pre5: 701,008 bytes
    UHARC 0.6b, -mz: 767,290 bytes
    LZPXJ 1.2a, -m3: 794,270 bytes
    LZPX 1.5b: 896,371 bytes
    PKZIP 2.50, -exx: 1,331,724 bytes


  29. #29
    The Founder encode's Avatar
    By the way, QAZAR 0.0pre5, without tuning its switches for the specific file, compresses 'world95.txt' to 643,721 bytes. Meanwhile, without any options, TC 5.0dev8 compresses it to 586,745 bytes.

    Conclusions:
    Do not trust the SFC results too much! A typical user will not try to find the best combination of switches - that's insanity!

  30. #30
    The Founder encode's Avatar
    Also, I have an idea to create a GUI program based on the TC compression engine. Maybe it will be TC 5.1, or it will get a brand new name. Features:
    + Three compression modes: Fast, Normal, Max
    + CRC32 checking
    + EXE filter
    + Still faster than RAR and MUCH faster than 7-Zip, while providing higher compression than ZIP. (I think PIMPLE's lack of success is due to its speed.)
    + Simple PIMPLE-like GUI

