
Thread: BALZ v1.08 is here!

  1. #1: encode (The Founder)

    BALZ v1.08 is here!

    This is HEAVY!

    New version introduces:
    • A larger ROLZ model (192 MB)
    • And, as the main part, a brand-new CM encoder for LZ-output coding
    The new CM is unique and crazy - this thing really makes the difference - check for yourself...

    Will wait for results of this HEAVY DUTY release...

    http://encode.su/balz/index.htm


  2. #2: Shelwien (Administrator)
    Test machine: E2160 @ 9x360 = 3.24 GHz, DDR2-800 5-5-5-18 @ 900 MHz
    http://shelwien.googlepages.com/balz108.htm

    Just switch to PPM already - it's become slow enough.

    Also, I kind of expected a more significant improvement after your posts.
    But I guess that's what you can get from literals.

    And I wonder what happens with wcc386...

  3. #3: encode (The Founder)
    I have an idea about pure CM. However, I think BALZ will still be faster at decompression than CM or high-end PPM. I will look at its compression performance overall. That said, this LZ layer makes the program far more complex compared to pure CM/PPM. Anyway, to outperform the current BALZ we need something serious...

  4. #4: Shelwien (Administrator)
    1. CM traditionally mixes several models, while PPM works with a single one (at a specified order).
    And if you still care about speed, then PPM is the obvious choice.
    2. Even a low-end PPM, like PPMd or faster, would have better compression
    at the speed of balz 1.08 decoding.

  5. #5: encode (The Founder)
    Quote Originally Posted by Shelwien View Post
    2. Even a low-end PPM, like PPMd or faster, would have better compression
    at the speed of balz 1.08 decoding.
    Binaries to compare - bring them on!

  6. #6: encode (The Founder)
    The next step will be a stronger CM + SSE/APM. I just tested BALZ with such a configuration. Compression speed is the same, but the compression ratio is notably higher and brings BALZ to a definitely new level. Now I will experiment with SSE/APM to find the fastest variant.

    Shelwien, can you recommend something? An APM a la PAQ7/9?

    Mostly, nothing has really changed since PAQ6:

    Code:
    p = ((pr * 3) + ssep) / 4

    ssecontext = (cxt << 2) + (c >> 6)
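    For readers unfamiliar with the term: a PAQ-style APM refines a primary probability by looking it up in a small per-context table indexed by the stretched probability, with linear interpolation between adjacent entries. A minimal sketch of that idea - all names and constants are mine, and it uses doubles for clarity instead of the usual 12-bit fixed point:

    ```cpp
    #include <cassert>
    #include <cmath>
    #include <vector>

    struct APM {
        static constexpr int K = 33;        // interpolation points per context
        std::vector<double> t;              // table: nctx * K probabilities
        int lo = 0, hi = 0, ctx0 = 0;       // cells used by the last prediction
        double w = 0;                       // interpolation weight

        explicit APM(int nctx) : t(nctx * K) {
            for (int c = 0; c < nctx; ++c)
                for (int i = 0; i < K; ++i) // start as an identity mapping
                    t[c * K + i] = squash((i - 16) / 4.0);
        }
        static double stretch(double p) { return std::log(p / (1 - p)); }
        static double squash(double x)  { return 1 / (1 + std::exp(-x)); }

        // Refine primary probability p in context ctx.
        double pp(double p, int ctx) {
            double x = stretch(p) * 4 + 16; // map stretch(p) into [0, 32]
            if (x < 0) x = 0;
            if (x > 32) x = 32;
            lo = (int)x;
            hi = lo < 32 ? lo + 1 : lo;
            w = x - lo;
            ctx0 = ctx;
            return t[ctx * K + lo] * (1 - w) + t[ctx * K + hi] * w;
        }
        // After the bit is known, move the two used cells toward it.
        void update(int bit, double rate = 0.02) {
            t[ctx0 * K + lo] += (bit - t[ctx0 * K + lo]) * rate * (1 - w);
            t[ctx0 * K + hi] += (bit - t[ctx0 * K + hi]) * rate * w;
        }
    };

    int main() {
        APM apm(4);
        // Initially the table is (nearly) an identity mapping.
        double p = apm.pp(0.7, 1);
        assert(p > 0.6 && p < 0.8);
        // Feed many 1-bits at the same input probability: output drifts up.
        for (int i = 0; i < 500; ++i) { apm.pp(0.7, 1); apm.update(1); }
        assert(apm.pp(0.7, 1) > p);
        return 0;
    }
    ```

    The `p = ((pr * 3) + ssep) / 4` mix from the post would then blend `pr` with this table's output.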


  7. #7: Shelwien (Administrator)
    http://shelwien.googlepages.com/ppmdj1.htm

    So I still think that even getting PPMd to run at the speed of BALZ 1.08
    decoding is pretty realistic:
    1. Further compiler tweaking is possible (like profile-guided compilation etc.)
    2. The RC is far from effective (and more than 5 years old anyway):
    2.1. There's a division!
    2.2. It's a carryless RC with complex conditions - especially bad for decoding, and significantly redundant too
    2.3. There's unary coding, with up to MtF-rank iterations per byte
    3. The model is old:
    3.1. There are linear byte counters with rescaling - slow and ineffective
    3.2. There's no SSE, although it's the best technique for fast context models
    4. Also, there's getc/putc I/O.

    But a better idea would probably be to write something new,
    for example based on fpaq0f2, and with all the new tricks.
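    The core of fpaq0f2 is a state map: one adaptive probability per bit-history context, updated with a rate that decays as a hit count grows. A rough sketch of that idea (illustrative names only, not the actual fpaq0f2 source):

    ```cpp
    #include <cassert>
    #include <vector>

    struct StateMap {
        struct Cell { double p = 0.5; int n = 0; };
        std::vector<Cell> t;
        int cx = 0;                              // context of last prediction
        explicit StateMap(int n) : t(n) {}
        double p(int ctx) { cx = ctx; return t[cx].p; }
        void update(int bit, int limit = 127) {  // adapt fast early, stable later
            Cell& c = t[cx];
            if (c.n < limit) ++c.n;
            c.p += (bit - c.p) / (c.n + 1.5);
        }
    };

    int main() {
        StateMap sm(256);
        int history = 0;                         // last 8 bits as context
        // Feed an alternating bit stream: each 8-bit history then
        // fully determines the next bit, and the map should learn it.
        for (int i = 0; i < 2000; ++i) {
            int bit = i & 1;
            sm.p(history);                       // predict (value unused here)
            sm.update(bit);
            history = ((history << 1) | bit) & 255;
        }
        assert(sm.p(0x55) < 0.05);               // after ...01010101, expect 0
        assert(sm.p(0xAA) > 0.95);               // after ...10101010, expect 1
        return 0;
    }
    ```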

  8. #8: Nania Francesco (Tester)

    Tested BALZ on MOC

    BALZ 1.08 option ex has been added to MOC!

  9. #9: encode (The Founder)
    Quote Originally Posted by Nania Francesco View Post
    BALZ 1.08 option ex has been added to MOC!
    Thank you!

  10. #10: Shelwien (Administrator)
    You can look at my SSE at http://shelwien.googlepages.com/order_test5.rar
    It's two-dimensional there, though, but it could easily be
    cut down to a simple SSE and optimized really well.
    So I'd recommend using something along that line, and my
    counters too (though there's much room for speed
    optimizations).

    Of course, my approach seems slower than Matt's due to interpolation,
    other multiplications etc., and larger counters.
    But then, from my experience it seems that a simple model
    with these "slow" elements commonly compresses better than a
    complex one implemented with "fast" elements.
    And runs faster too, because of that simplicity.

    Actually, as already mentioned, I've recently been using paq8 results
    as an optimization threshold.
    And there hasn't been even a single case where it seemed hard to
    beat paq8 - on my specific data, of course, but even when I
    format it really conveniently for paq8.
    It seems that a common scheme sufficient for winning over
    paq8 on custom data is unary/binary decomposition with SSE.
    The most probable symbols/values are checked first, each with a separate
    and optimized counter and SSE context, and then the other
    values are encoded bit by bit, with or without SSE depending on how
    rare they are.
    Of course, further improvements are possible by constructing submodel
    versions with different contexts and merging their predictions with
    a mixer or SSE2, but that would probably already be too slow for you.

    And as to paq's APM, I think that's a different thing from my SSE.
    But you just have to understand what it's about and use whatever
    structures you consider appropriate.
    Just in case, I'll repeat: SSE significantly improves compression only
    if used for context clustering, not to fix broken counters like in paq.
    Last edited by Shelwien; 20th May 2008 at 21:16.
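    The unary/binary decomposition described above can be sketched without the coder and counters, just as the sequence of binary decisions it produces: one "is it this symbol?" flag per top-ranked symbol (each of which would get its own counter and SSE context), then a plain bitwise fallback for everything else. This is a hypothetical structure, not code from any released coder:

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Encode side: produce the list of binary decisions for one byte.
    std::vector<int> decompose(uint8_t sym, const std::vector<uint8_t>& top) {
        std::vector<int> bits;
        for (uint8_t t : top) {           // unary part: one flag per ranked symbol
            if (sym == t) { bits.push_back(1); return bits; }
            bits.push_back(0);
        }
        for (int i = 7; i >= 0; --i)      // binary fallback: plain 8 bits
            bits.push_back((sym >> i) & 1);
        return bits;
    }

    // Decode side: consume the same decisions back into a byte.
    uint8_t recompose(const std::vector<int>& bits, const std::vector<uint8_t>& top) {
        size_t i = 0;
        for (uint8_t t : top)
            if (bits[i++]) return t;
        uint8_t sym = 0;
        for (int b = 0; b < 8; ++b) sym = (sym << 1) | bits[i++];
        return sym;
    }

    int main() {
        std::vector<uint8_t> top = {'e', 't', 'a'};   // hypothetical rank list
        for (int s = 0; s < 256; ++s) {               // round-trips for all bytes
            auto bits = decompose((uint8_t)s, top);
            assert(recompose(bits, top) == (uint8_t)s);
        }
        assert(decompose('e', top).size() == 1);      // top symbol: one decision
        return 0;
    }
    ```

    The point of the split is that the few unary decisions carry most of the entropy and can afford expensive, well-tuned SSE contexts, while the rare fallback bits can stay cheap.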

  11. #11: encode (The Founder)
    Thanks! I will try to do something...
    Also, maybe I won't rush into APM and will use stronger counters first. I've just explored some interesting counters...

  12. #12: encode (The Founder)
    BTW, the next version of BALZ will be released VERY soon (within 1-2 days). I just invented a new counter for the CM model. So you may not even want to start testing this one seriously...
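    The counter itself wasn't posted, so purely as an illustration of what "a counter for the CM model" usually means (this is not the actual BALZ counter): a fixed-point probability nudged toward each observed bit by a shift-sized step.

    ```cpp
    #include <cassert>
    #include <cstdint>

    struct Counter {
        uint16_t p = 2048;                    // probability of a 1, 12-bit scale
        void update(int bit, int rate = 5) {  // smaller rate = faster adaptation
            if (bit) p += (4096 - p) >> rate;
            else     p -= p >> rate;
        }
    };

    int main() {
        Counter c;
        for (int i = 0; i < 100; ++i) c.update(1);
        assert(c.p > 3900);                   // converged toward "always 1"
        for (int i = 0; i < 100; ++i) c.update(0);
        assert(c.p < 250);                    // and back down again
        return 0;
    }
    ```

    Most of the design space the thread hints at (stronger counters) is in replacing the fixed shift with a state- or count-dependent rate.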

  13. #13: Moderator
    Quote Originally Posted by encode View Post
    This is HEAVY!

    New version introduces:
    • A larger ROLZ model (192 MB)
    • And, as the main part, a brand-new CM encoder for LZ-output coding
    The new CM is unique and crazy - this thing really makes the difference - check for yourself...

    Will wait for results of this HEAVY DUTY release...

    http://encode.su/balz/index.htm

    Thanks Ilia!

    Mirror: Download

    Looking forward to the next release!

  14. #14: Matt Mahoney (Expert)

  15. #15: encode (The Founder)
    Quote Originally Posted by Matt Mahoney View Post
    Thanks a lot!

  16. #16: Trixter (Member)
    Quote Originally Posted by encode View Post
    Thanks a lot!
    Not so fast:

    Code:
    Compressor   Opt     enwik8      enwik9         Prog      Total       Comp Decomp  Mem Alg
    ---------    ---   ---------   -----------     -------  -----------   ----  ----   --- ----
    balz 1.02          30,634,726  268,552,062     48,030 x  268,600,092  21804    58  346 LZ77
    balz 1.06    e     28,674,640                                          1580    79   67 ROLZ
    balz 1.06    ex    28,234,913  245,288,229     48,937 x  245,337,166   2440    75   67 ROLZ
    balz 1.07    e     28,271,200                                          1060    96  132 ROLZ
    balz 1.07    ex    27,416,245  237,492,151     49,082 x  237,541,233   2106    77  132 ROLZ
    balz 1.08    ex    26,534,890  229,477,116     49,351 x  229,526,467   4431   126  200 ROLZ
    BALZ's decompression is twice as slow in 1.08?

    One of the reasons balz is so exciting to me is the decompression speed... are you now heading toward tighter compression at the expense of decompression speed?

  17. #17: encode (The Founder)
    Quote Originally Posted by Trixter View Post
    One of the reasons balz is so exciting to me is the decompression speed... are you now heading toward tighter compression at the expense of decompression speed?
    1.08 is slower due to the larger ROLZ model, plus the stronger LZ-output encoder. Well, I'm trying to increase compression, even at some cost in decompression speed... Without strong compression, BALZ makes no sense... In other words, it wouldn't be so interesting for most of the audience.

  18. #18: Shelwien (Administrator)
    There's actually no reason to write a completely new compressor if the goal
    is just to get better performance than some known compressor, e.g. 7-zip.
    In such cases you just take the LZMA source and optimize it until you reach your goal -
    the original developers rarely do full optimization.
    And even if there are no sources, it's always possible to reverse-engineer the algorithm
    and optimize it after that - an interesting topic too.

    Well, what I wanted to say is that, for now, getting better than some
    specific target is mostly just a matter of code optimization - until somebody does it first.

    On the other hand, it's still mostly unknown how to build a good model for given data,
    so imho working in this area is much more creative.

