Thread: BALZ v1.11 is here!

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts

    BALZ v1.11 is here!

    Triple one = triple strike!
    OK, what's new:
    + Improved/optimized mixer
    + Optimized Match Finder
    + Many general optimizations; some parts of the code were completely rewritten

    As a result, I'm back to a 19 sec. compression time for fp.log on my machine. On some files, like the ENWIKs, compression speed improved dramatically... Overall, the new BALZ became notably faster, with slightly better compression.

    This is serious!

    http://encode.su/balz/index.htm


  2. #2
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Ilia!

    Mirror: Download

  3. #3
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    395
    Thanks
    148
    Thanked 225 Times in 123 Posts
    timer.exe balz.exe ex fp.log fp.logex
    Kernel Time = 0.140 = 00:00:00.140 = 0%
    User Time = 33.562 = 00:00:33.562 = 99%
    Process Time = 33.703 = 00:00:33.703 = 99%
    Global Time = 33.813 = 00:00:33.813 = 100%

    timer.exe balz.exe e fp.log fp.loge
    Kernel Time = 0.156 = 00:00:00.156 = 4%
    User Time = 3.453 = 00:00:03.453 = 95%
    Process Time = 3.609 = 00:00:03.609 = 100%
    Global Time = 3.609 = 00:00:03.609 = 100%


    timer.exe balz.exe d fp.logex fp.logex1
    Kernel Time = 0.031 = 00:00:00.031 = 3%
    User Time = 0.406 = 00:00:00.406 = 44%
    Process Time = 0.437 = 00:00:00.437 = 48%
    Global Time = 0.906 = 00:00:00.906 = 100%

    timer.exe balz.exe d fp.loge fp.loge1
    Kernel Time = 0.046 = 00:00:00.046 = 4%
    User Time = 0.531 = 00:00:00.531 = 50%
    Process Time = 0.578 = 00:00:00.578 = 55%
    Global Time = 1.046 = 00:00:01.046 = 100%



    timer.exe balz.exe ex enwik8 enwik8ex
    Kernel Time = 1.031 = 00:00:01.031 = 0%
    User Time = 248.390 = 00:04:08.390 = 98%
    Process Time = 249.421 = 00:04:09.421 = 98%
    Global Time = 252.546 = 00:04:12.546 = 100%

    timer.exe balz.exe e enwik8 enwik8e
    Kernel Time = 0.953 = 00:00:00.953 = 0%
    User Time = 122.468 = 00:02:02.468 = 98%
    Process Time = 123.421 = 00:02:03.421 = 99%
    Global Time = 123.859 = 00:02:03.859 = 100%

    timer.exe balz.exe d enwik8ex enwik8ex1
    Kernel Time = 0.281 = 00:00:00.281 = 1%
    User Time = 14.812 = 00:00:14.812 = 84%
    Process Time = 15.093 = 00:00:15.093 = 85%
    Global Time = 17.594 = 00:00:17.594 = 100%

    timer.exe balz.exe d enwik8e enwik8e1
    Kernel Time = 0.296 = 00:00:00.296 = 1%
    User Time = 14.968 = 00:00:14.968 = 83%
    Process Time = 15.265 = 00:00:15.265 = 84%
    Global Time = 18.016 = 00:00:18.016 = 100%

    CPU: T7100 C2D@1.8
    OS: XP SP3
    KZo


  4. #4
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    Thanks for testing!

  5. #5
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,239
    Thanks
    192
    Thanked 968 Times in 501 Posts

  6. #6
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    bible.txt: 883,669 bytes
    calgary.tar: 837,828 bytes
    book1: 257,600 bytes
    pht.psd: 1,102,034 bytes
    valley.cmb: 8,927,499 bytes
    sfc.7z: 12,026,180 bytes

    A10.jpg: 837,755 bytes
    acrord32.exe: 1,379,561 bytes
    english.dic: 745,926 bytes
    FlashMX.pdf: 3,724,473 bytes
    fp.log: 554,522 bytes
    mso97.dll: 1,808,446 bytes
    ohs.doc: 806,945 bytes
    rafale.bmp: 982,389 bytes
    vcfiu.hlp: 640,138 bytes
    world95.txt: 556,972 bytes
    Total: 12,037,127 bytes



  7. #7
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    Quote Originally Posted by Shelwien View Post
    Thank you!

  8. #8
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    Quote Originally Posted by Simon Berger
    Congratulations, encode! It grows from version to version into a really interesting/good compression tool. But you have to stop making such big steps, or I'll have to worry about not beating you in the coming weeks, or rather months...

    Btw: don't let Shelwien manipulate you with his anti-LZ-based-algo trip

  9. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,239
    Thanks
    192
    Thanked 968 Times in 501 Posts
    I don't actually care about the success of my CM ad campaign, as its goal is the process itself.

  10. #10
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Mhm, thanks encode, now I had to register myself here too.

    I'm not with you on this statement, Shelwien. CPU power gets higher and higher, and maybe we'll soon have the first programs using the GPU (I will try something in this direction later too), but asymmetric algorithms are much more interesting and currently also more powerful overall (ratio). I don't even have to mention real-time use...
    CCM showed what is possible, but I think that's the limit in terms of speed. LZ-based tools cover such a big range of ratios, and every month there are new interesting ones (very fast with less compression, slow with very good compression...), and sometimes very well-balanced ones (LZMA, BALZ, UHARC and some more).
    I know a lot of this comes down to personal taste.

  11. #11
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,239
    Thanks
    192
    Thanked 968 Times in 501 Posts
    > I'm not with you on this statement, Shelwien.

    Which statement?

    > programs using the GPU (I will try something in this direction later too)

    I've toyed with CUDA, and it seems really simple and convenient.
    So it would surely be useful for modelling if I ever write something
    as slow as paq8. It may also be helpful for BWT sorting.
    But I don't really see how it can be used for LZ.

    > but asymmetric algorithms are much more
    > interesting and currently also more powerful
    > overall (ratio).

    I'm only advertising the use of modelling for compression (what a novel idea!),
    not any specific choice of algorithm. So I'd be happy even with CM used for
    file segmentation and automatic restructuring before an LZ pass.
    (Btw, Shkarin's seg_file is a simple CM too.)

    Also, I just don't understand why you think that CM is always symmetrical.
    Well, most known examples are (actually they have slower decompression),
    but that's just a matter of laziness: supporting a single model is
    much easier than synchronizing an asymmetrical (and thus different)
    encoder and decoder.

    > I don't even have to mention real-time use...

    That's where LZ loses to CM, btw. LZ has either slow compression
    or poor compression, so measured by compression + transmission + decompression
    time, CM is favourable.
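    (A made-up example with round numbers, just to illustrate: sending a
    100 MB file over a 10 MB/s link. A strong LZ that compresses at 1 MB/s
    down to 30 MB and decompresses at 50 MB/s needs 100 + 3 + 2 = 105 sec
    end-to-end, while a CM running at 4 MB/s both ways and reaching 25 MB
    needs 25 + 2.5 + 25 = 52.5 sec.)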

    > CCM showed what is possible, but I think that's the limit in terms
    > of speed. LZ-based tools cover such a big range of ratios,
    > and every month there are new interesting ones

    But I still want some development in the CM area too,
    mainly because it's rare to see any advanced modelling techniques in LZ.
    ...Probably because people who are able to use such techniques don't see
    any reason to work on LZ.

  12. #12
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    If only you could get back to 1.07 speeds with the same-ish compression (a slightly bigger compressed file is fine!), then you'd be onto something.

    Nice to see the progression though, keep it up!

  13. #13
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    Quote Originally Posted by Intrinsic View Post
    If only you could get back to 1.07 speeds with the same-ish compression (a slightly bigger compressed file is fine!), then you'd be onto something.
    I'm already in the game! Just check out some figures from Shelwien's test, to kill the myth:
    BALZ 1.07, e: 17,077,650 bytes - 21 sec.
    BALZ 1.07, ex: 16,681,485 bytes - 51 sec.
    BALZ 1.11, e: 16,644,529 bytes - 35 sec.
    BALZ 1.11, ex: 16,216,199 bytes - 83 sec.

    No comments... As usual, don't forget about the "e" switch...

    Quote Originally Posted by Shelwien
    But I still want some development in the CM area too,
    mainly because it's rare to see any advanced modelling techniques in LZ.
    ...Probably because people who are able to use such techniques don't see
    any reason to work on LZ.
    I don't think so. First of all, what do you call advanced modeling techniques?
    Secondly, I think authors of LZ-based programs aim at fast decompression, or at least want to keep the asymmetric nature of their programs.
    For example, I could add stronger and more advanced LZ-output encoding, but at the cost of 4x-5x slower decompression... Not really worth it, in my opinion. At the same time I consider the current coding scheme an advanced one...

  14. #14
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    Yep, I was referring to the tests Shelwien posted.

    Currently it's around 1.6x slower than 1.07; that's a pretty big margin to recover. Let's hope you can get near it!

  15. #15
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    From the compression point of view, v1.11 is far more interesting; BALZ v1.07 has weak compression...

  16. #16
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,239
    Thanks
    192
    Thanked 968 Times in 501 Posts
    > First of all, what do you call advanced modeling techniques?

    I already enumerated that somewhere, but well:

    1. Basic statistics
    1.1. CM (context modelling) with simple incremental counters (frequencies) -
    despite common belief, it can be faster than working with probabilities.
    1.2. CM with shift-based probability counters (fpaq0-like; see the sketch after this list)
    1.3. combinatorial multiplication-based probability counters (frequency simulation)
    1.4. state machine counters
    1.5. delayed counters
    1.6. delayed counters with context interpolation and indirect update
    1.7. (=2.5) +context quantization
    2. Secondary estimation
    2.1. Using a quantized probability in context
    2.2. Nonlinear probability mapping
    2.3. +Interpolation
    2.4. +Indirect update
    3. Prediction merging techniques
    3.1. Switching (eg. by amortized code length)
    3.2. Static linear mixing
    3.3. Adaptive linear mixing
    3.4. +Indirect update
    3.5. Multi-dimensional version of "secondary estimation"
    3.6. Update back-propagation
    4. Precision control
    4.1. Using interval arithmetics to calculate the prediction error (and make a correction)
    4.2. Adaptive correction by using the prediction error in contexts
    5. Parameter optimization techniques
    5.0. Manual
    5.1. Simple bruteforce
    5.2. CM-driven bruteforce (using correlations between parameter set and output size)
    5.3. Bayesian (likelihood-based) (eg. by polynomial approximation)
    6. Symmetry control
    6.1. Blockwise redundancy check
    6.2. Statistical segmentation
    6.3. Adaptive model optimization and parameter values encoding
    7. Applied modelling
    7.1. Hash function design
    7.2. Speed optimization
    7.2.1. Alphabet decomposition by Huffman's algorithm
    7.2.2. Faster processing of probable cases in general
    7.3. Serial improvement (adaptively using a secondary model to determine the design of primary model,
    eg. symbol ranking)
    7.4. Speculative processing (mostly relevant for decoding and threaded implementations)

    ...is what I can think of right away.
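    For concreteness, here's a toy sketch of items 1.2, 3.3 and 2.1/2.2
    (hypothetical code, not taken from BALZ or paq; the update shift,
    learning rate and table size are arbitrary choices):

    #include <cmath>

    const int SCALE = 4096;                  // 12-bit fixed-point probabilities

    // 1.2: fpaq0-like shift-based counter; p = P(bit==1) moves a fixed
    // fraction (1/32 here) toward each observed bit.
    struct Counter {
        int p = SCALE / 2;
        void update(int bit) {
            if (bit) p += (SCALE - p) >> 5;
            else     p -= p >> 5;
        }
    };

    // Logistic transform pair shared by the mixing and secondary stages.
    double stretch(double p) { return std::log(p / (1 - p)); }
    double squash (double x) { return 1 / (1 + std::exp(-x)); }

    // 3.3: adaptive linear mixing of two predictions in the logistic domain;
    // weights follow the gradient of coding cost, so better models gain weight.
    struct Mixer {
        double w[2] = {0.5, 0.5}, in[2];
        double mix(double p0, double p1) {
            in[0] = stretch(p0);
            in[1] = stretch(p1);
            return squash(w[0] * in[0] + w[1] * in[1]);
        }
        void update(double p, int bit) {     // p = the last mixed prediction
            for (int i = 0; i < 2; i++)
                w[i] += 0.02 * (bit - p) * in[i];
        }
    };

    // 2.1/2.2: secondary estimation: the quantized mixed probability indexes
    // a table of counters, which learns a nonlinear correction mapping.
    struct SSE {
        Counter t[64];
        SSE() { for (int i = 0; i < 64; i++) t[i].p = (2 * i + 1) * SCALE / 128; }
        Counter& cell(double p) { return t[int(p * 64)]; }
    };

    Per coded bit: mix the submodel predictions, refine the result through the
    SSE cell, code the bit with the refined probability, then update the cell,
    the mixer and the counters.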
    Most of these techniques can be applied in a practical PC compressor in some form, as they
    can usually be balanced for the necessary performance.
    Surely other techniques would also come into view after building an advanced model with most
    of these, but for that this area needs attention and discussion, which are currently
    concentrated on basic algorithmic optimizations and various interface/format/design details.

    > Secondly, I think authors of LZ-based programs aim at fast
    > decompression, or at least want to keep the asymmetric nature
    > of their programs.

    And I think they're just too lazy to learn something new, and keep themselves busy
    with already-known things, which is always possible because perfection is unreachable.

    I mean that LZ decoding wins in speed only up to a certain level (around the decoding
    speed of rar); beyond that, the statistical approach allows better results,
    of course at the cost of extra programming.

    > For example, I could add stronger and more advanced LZ-output
    > encoding, but at the cost of 4x-5x slower decompression...

    Would you bet that nobody is able to add a stronger model to your compressor
    while keeping the same (or better) speed?

    Also, as I said, LZ has some drawbacks which are very troublesome to work around,
    like redundancy (different parsings decode to the same data) and hidden data correlations,
    so it's inefficient to push LZ's compression further after some point.
    Btw, that applies to BWT as well; it at least doesn't have the alternative-coding
    redundancy, but imho it has an even smaller area of effective application than LZ.

    > Not really worth it, in my opinion. At the same time I
    > consider the current coding scheme an advanced one...

    It is only advanced compared to LZH, and not so much considering my list.
    Anyway, I'd advise concentrating on speed optimizations if you want to keep it LZ.
    There are usually a lot of algorithmic optimizations applicable in arithmetic coding,
    like removing multiplications by using logarithmic counters (though that might be patented).
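    To show where that multiplication actually sits, here is a generic
    LZMA-style binary range encoder (a sketch, not BALZ's coder). The
    multiply is the 'bound' line; the (patented) Q/MQ-coder family of
    techniques, which is presumably what's alluded to here, removes it by
    keeping 'range' near a fixed value so that range*p can be approximated
    by p alone.

    #include <cstdint>
    #include <vector>

    struct RangeEncoder {
        uint64_t low = 0, cacheSize = 1;
        uint32_t range = 0xFFFFFFFFu;
        uint8_t  cache = 0;                  // first output byte is a dummy 0
        std::vector<uint8_t> out;

        // p0 = P(bit==0) as a 12-bit fixed-point value, 0 < p0 < 4096
        void encode(int bit, uint32_t p0) {
            uint32_t bound = (range >> 12) * p0;   // <-- the multiplication
            if (bit == 0) range = bound;
            else { low += bound; range -= bound; }
            while (range < (1u << 24)) { range <<= 8; shiftLow(); }
        }
        void shiftLow() {                    // carry-aware byte output
            if ((uint32_t)low < 0xFF000000u || (low >> 32) != 0) {
                uint8_t carry = (uint8_t)(low >> 32);
                out.push_back((uint8_t)(cache + carry));
                while (--cacheSize) out.push_back((uint8_t)(0xFF + carry));
                cache = (uint8_t)(low >> 24);
            }
            cacheSize++;
            low = (low & 0x00FFFFFFu) << 8;
        }
        void flush() { for (int i = 0; i < 5; i++) shiftLow(); }
    };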

  17. #17
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    Well, I believe that certain optimizations/improvements are still possible, although it is not as easy as someone may think. RZM gives me a new level/benchmark to aim for. Compared to genuine ROLZ by Malcolm Taylor, BALZ is OK!

