Page 1 of 3
Results 1 to 30 of 62

Thread: balz v1.00 - new LZ77 encoder is here!

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    I started out writing a byte-wise LZ77 and, after adding arithmetic coding, ended up with a new QUANTUM-like encoder:
    + Baseline LZ77 with 1 MB dictionary
    + Arithmetic encoding
    + Modified Storer&Szymanski parsing scheme
    + EXE transformer
    Anyway, this new thing gives higher compression on binary files than LZPM, while also being faster at decompression.

    Stop talking, check this out:
    balz100.zip (46 KB)

    But note, this piece can be VERY slow, especially on files like FP.LOG and on text files like ENWIK8/ENWIK9. So, test it on new machines like a Core 2 Duo, you know what I'm saying...

    Enjoy anyway!


  2. #2
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Just found a mistake in the source. A new version will be released ASAP, be patient!

  3. #3
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    OK, here we are:
    balz101.zip (46 KB)


  4. #4
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Thanks Ilia!

    Quote Originally Posted by encode
    But note, this piece can be VERY slow, especially on files like FP.LOG and on text files like ENWIK8/ENWIK9. So, test it on new machines like a Core 2 Duo, you know what I'm saying...
    I may have to wait for members with powerful (expensive) machines to test this one though!

  5. #5
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Quote Originally Posted by LovePimple
    I may have to wait for members with powerful (expensive) machines to test this one though!
    If you're testing the PAQ series on your machine, no worries!

  6. #6
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    The thing with v1.00 was a "hidden" mistake - the coder just drops too many small matches. I'm afraid that all versions of LZPM have this bug too. Anyway, I believe that with additional proper BALZ tuning we may get another compression gain, since all the tuning was done on the broken scheme. I will release another version after careful testing...

  7. #7
    Member
    Join Date
    Jan 2008
    Posts
    33
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi, encode

    Test of enwik8

    My machine is an Intel Core 2 Quad Q6600 (quad core) 2.4 GHz + 2 GB DDR800 RAM, Windows Vista 32-bit

    AcuTimer v1.2
    Copyright (c) 2007 by LovePimple

    balz v1.01 by encode
    optimizing 16384k block...
    optimizing 16384k block...
    optimizing 16384k block...
    optimizing 16384k block...
    optimizing 16384k block...
    optimizing 15736k block...
    done

    Elapsed Time: 00 00:35:56.813 (2156.813 Seconds)

    final size---> 29,881,549 bytes

    It used about 25-30% CPU load.
    The elapsed time is not accurate, though, because I was running eMule during compression.

  8. #8
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Try to test it on binary files and TARs with lots of stuff.

  9. #9
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    In addition, I will try to change the string searching method. Note that BALZ displays the progress of string searching - the actual parsing begins when the compressor shows 100%. As you can see, parsing and coding take almost no time compared to the full string search...

  10. #10
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    No extra testing results so far...

  11. #11
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 153 Times in 18 Posts
    Quote Originally Posted by encode
    No extra testing results so far...
    Here you go:
       843.694 A10.balz
     1.410.320 AcroRd32.balz
       937.509 english.balz
     3.733.912 FlashMX.balz
       866.344 FP.balz
     1.869.867 MSO97.balz
       835.558 ohs.balz
     1.035.397 rafale.balz
       684.231 vcfiu.balz
       612.546 world95.balz

    -> 12.829.378 Bytes

    I have not timed compression, but decompression speed is really nice!

  12. #12
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Well, running both BALZ and RZM at the same time on a single-core (and not very powerful) machine sure is fun

    Since I don't have a test machine at hand, only sizes without speeds follow. Thanks for the release

    14 228 209 - BALZ 1.01
    13 318 039 - LZPM 0.15

  13. #13
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,375
    Thanks
    214
    Thanked 1,023 Times in 544 Posts
    > + Baseline LZ77 with 1 MB dictionary
    > + Arithmetic encoding

    Are you including EXE transformation choices into your optimization?
    Or do you simply relocate the 32-bit suffixes of E8/E9 if there was an MZ header?
    Well, I mean that relocating or not is the same kind of choice as with matches...

  14. #14
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    The EXE transformer works separately - the main encoder doesn't even know that a transformation was applied.
    The transformer searches the block for the 32-bit PE signature (0x00004550, i.e. "PE\0\0") and applies the E8/E9 transformation after that point.
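
    For readers who have not met E8/E9 filtering before, here is a minimal Python sketch of the general idea: x86 CALL (E8) and JMP (E9) opcodes are followed by a 32-bit relative offset, and rewriting that offset as an absolute target makes repeated calls to the same function look identical to the LZ77 stage. This is not BALZ's source. The function name e8e9_transform, the choice of adding the address of the next instruction, and converting every E8/E9 byte without any range check are all assumptions made for illustration.

```python
import struct

PE_SIGNATURE = b"PE\x00\x00"  # 0x00004550 stored little-endian


def e8e9_transform(data: bytes, encode: bool) -> bytes:
    """Hypothetical E8/E9 filter: relative -> absolute when encode=True, the reverse otherwise."""
    buf = bytearray(data)
    start = buf.find(PE_SIGNATURE)
    if start < 0:
        return bytes(buf)  # no PE signature found: leave the block untouched

    i = start + len(PE_SIGNATURE)
    end = len(buf) - 4  # an opcode needs 4 operand bytes after it
    while i < end:
        if buf[i] in (0xE8, 0xE9):
            (value,) = struct.unpack_from("<I", buf, i + 1)
            next_instruction = i + 5  # opcode byte plus 4-byte operand
            if encode:
                value = (value + next_instruction) & 0xFFFFFFFF
            else:
                value = (value - next_instruction) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, value)
            i += 5  # skip the operand so both directions visit the same opcodes
        else:
            i += 1
    return bytes(buf)
```

    Because the operand bytes are always skipped and everything up to the signature is left unchanged, running the function with encode=False on the transformed block restores the original bytes, so the main encoder never needs to know the filter exists.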


  15. #15
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quick test...

    Test machine: Intel PIII (Coppermine) @750 MHz, 512 MB RAM, Windows 2000 Pro SP4

    Test File: VALLEY.CMB (19,776,230 bytes)

    Timed with AcuTimer v1.2

    Compression
    Compressed Size: 8,880,490 bytes
    Elapsed Time: 00:18:49.767 (1129.767 Seconds)

    Decompression
    Elapsed Time: 00:00:06.514 (6.514 Seconds)

  16. #16
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Thanks!

  17. #17
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Looks like the current version is OK in terms of carefully chosen parameters...

    I made some experiments with a version that uses lazy matching. Quite impressed by the performance - compression speed is awesome and the compression ratio is nice, though not as interesting as with SS parsing.

    Anyway, looking forward to official benchmark results - MFC, Squeeze Chart, Black Fox's Benchmark, MOC, etc...
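
    For context, lazy matching in general means: before committing to the match found at the current position, look at the match available one byte later, and if that one is longer, emit a literal and take the later match instead. The sketch below is a toy Python illustration of that rule and not BALZ code. The brute-force longest_match search, the MIN_MATCH value of 3, the token tuples and the 64 KB default window are all assumptions chosen to keep the example short.

```python
MIN_MATCH = 3  # assumed minimum useful match length


def longest_match(data: bytes, pos: int, window: int):
    """Brute-force search for the longest earlier match; returns (length, distance)."""
    best_len, best_dist = 0, 0
    for cand in range(max(0, pos - window), pos):
        length = 0
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - cand
    return best_len, best_dist


def lazy_parse(data: bytes, window: int = 1 << 16):
    """Parse data into ("lit", byte) and ("match", length, distance) tokens."""
    tokens = []
    pos = 0
    while pos < len(data):
        length, dist = longest_match(data, pos, window)
        if length >= MIN_MATCH:
            next_len, _ = longest_match(data, pos + 1, window)
            if next_len > length:
                # A longer match starts one byte later: defer with a literal.
                tokens.append(("lit", data[pos]))
                pos += 1
                continue
            tokens.append(("match", length, dist))
            pos += length
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def decode(tokens) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, length, dist = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-by-byte copy handles overlapping matches
    return bytes(out)
```

    The one-step lookahead costs a second search per match but never examines the whole block before parsing, which is one plausible reason a lazy variant compresses much faster than a scheme that first collects matches for every position.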


  18. #18
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    The result on ENWIK9 is:
    261,645,091 bytes

    The compression took about four hours on my Core 2 Duo...

    Well, I will release a version with a Lazy Matching strategy. With some parsing tricks the compression ratio will not be much worse, and at the same time compression will be FAST!


  19. #19
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    OK, the final "quick" release:
    balz102.zip (46 KB)

    This one is notably faster and has a 512K window.

  20. #20
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Thank you, it has been tested as well
    As the window was halved, it compresses twice as fast, but also decompresses a little slower. Not much change ratio-wise, except of course for pht/PSD, which is about 2.3 MB bigger...

  21. #21
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Thanks Ilia

  22. #22
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Quote Originally Posted by Black_Fox
    As the window was halved, it compresses twice as fast,
    On some files, like world95.txt, fp.log and the ENWIKs, compression is 4X-8X faster.

    Quote Originally Posted by Black_Fox
    also decompressing a little slower.
    In certain cases the decompression can be even faster...


  23. #23
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quick test...

    A10.jpg > 843,681
    AcroRd32.exe > 1,416,885
    english.dic > 937,379
    FlashMX.pdf > 3,740,800
    FP.LOG > 896,743
    MSO97.DLL > 1,893,199
    ohs.doc > 838,427
    rafale.bmp > 1,037,574
    vcfiu.hlp > 685,817
    world95.txt > 634,975

    Total = 12,925,480 bytes

  24. #24
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Thanks!

  25. #25
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Result MOC Test:
    160.113.645
    COMP TIME = 2.130,542 sec.
    DEC. TIME=26,476 sec.
    Hi Encode!

  26. #26
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    You just retested the previous version! (Same compression results as with v1.01!)

  27. #27
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Another quick test...

    Test machine: Intel PIII (Coppermine) @750 MHz, 512 MB RAM, Windows 2000 Pro SP4

    Test File: ENWIK8 (100,000,000 bytes)

    Timed with AcuTimer v1.2

    Compression
    Compressed Size: 5,177,344 bytes
    Elapsed Time: 00:36:40.436 (2200.436 Seconds)

    Decompression
    Elapsed Time: 00:00:20.729 (20.729 Seconds)

    Decompressed file is longer (100,000,121 bytes) than the original.

  28. #28
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Sorry, Encode! I had copied BALZ 1.02 into a different folder from the one used for the test! OK, I have tested it now! Good improvements in speed! It seems to be stable!
    MOC Test: 161.766.723
    Comp time=818,054 sec.
    Dec. time=25,683 sec.
    Hi !

  29. #29
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Intel Core 2 Duo E6600
    Enwik8 : 30.634.726 b
    Comp time= (397,994 Seconds) Acutimer
    Dec time= (7,061 Seconds) Acutimer

  30. #30
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Quote Originally Posted by LovePimple
    Decompressed file is longer (100,000,121 bytes) than the original.
    Are you sure?? On both my PC and my laptop decompression went OK!

    Can anyone confirm the bug?
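
    One way to check a report like this is to compare the original and the decompressed file byte by byte and note where they first diverge. The Python helper below is only a sketch for that purpose. The name first_mismatch and the 1 MB chunk size are arbitrary choices, and the file names in the usage comment are placeholders.

```python
from pathlib import Path


def first_mismatch(original: Path, restored: Path, chunk_size: int = 1 << 20):
    """Return None if both files are identical, else the offset of the first difference.

    If one file is a prefix of the other, the returned offset is the length
    of the shorter file.
    """
    offset = 0
    with open(original, "rb") as a, open(restored, "rb") as b:
        while True:
            chunk_a, chunk_b = a.read(chunk_size), b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended at the same point
            offset += len(chunk_a)


# Example call with placeholder file names:
# print(first_mismatch(Path("enwik8"), Path("enwik8.out")))
```

    If the result equals the size of the original, the decompressed file contains the whole original plus trailing bytes, which would match the "longer than the original" symptom described above.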
