
Thread: Prob/counter update

  1. #1
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Just some code examples for FPAQ0P-like compression. Actually LZPM uses this type of coding.

    Quote Originally Posted by Basic update
    if (y)
    pr += ((4096 - pr) >> 5);
    else
    pr -= (pr >> 5);
    Of course, initially pr = 2048;

    Quote Originally Posted by Advanced scheme
    pr += ((pr < 2048) + (((y << 12) - pr) >> 5));
    As you can see, the counter update fits in one line. It may also run faster than the previous code.

    The (pr < 2048) term is needed for balancing the probability.

    Instead of shifting by 5 we can shift by 4 (aggressive update).

    The question is: can we use something more efficient in terms of speed or compression, i.e. more accurate counter emulation or more efficient code?
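
    For reference, here is a minimal, self-contained C++ sketch of both updates side by side; the BitModel struct and the test harness are my own illustration, not LZPM code:

    #include <cstdio>

    // 12-bit probability that the next bit is 1; starts at 2048 (p = 0.5).
    struct BitModel {
        int pr = 2048;

        // Basic update (branchy form).
        void update_basic(int y) {
            if (y)
                pr += (4096 - pr) >> 5;
            else
                pr -= pr >> 5;
        }

        // Advanced update: branchless one-liner with the balancing term.
        void update_advanced(int y) {
            pr += (pr < 2048) + (((y << 12) - pr) >> 5);
        }
    };

    int main() {
        BitModel a, b;
        for (int i = 0; i < 10000; ++i) {
            int y = (i % 10 == 0);  // a skewed stream, roughly 10% ones
            a.update_basic(y);
            b.update_advanced(y);
        }
        // Both counters settle near 4096 * 0.1 = 410.
        printf("basic: %d  advanced: %d\n", a.pr, b.pr);
        return 0;
    }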


  2. #2
    Nania Francesco (Tester)
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    In Rings and Turtle:

    pre-calculated:

    if (y)
    pr[state] += ((256 - pr[state]) >> 3);
    else
    pr[state] -= (pr[state] >> 3);

    Initially, all pr[state] = 128;

  3. #3
    donkey7 (Member)
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,475
    Thanks
    26
    Thanked 121 Times in 95 Posts
    Quote Originally Posted by encode
    The question is: can we use something more efficient in terms of speed or compression, i.e. more accurate counter emulation or more efficient code?
    I'm using something more sophisticated, and I've explained my method to Francesco (here: http://www.encode.su/forums/index.php?action=vthread&forum=1&topic=465&page=3#msg6575 - first 43 lines).

    It's very slightly better than your code and has almost the same speed.

    Basically, your update method is optimal when probabilities aren't skewed, e.g. probabilities in the range <0.05; 0.95>.

  4. #4
    Uwe Herklotz (Member)
    Join Date
    Mar 2007
    Posts
    34
    Thanks
    0
    Thanked 0 Times in 0 Posts
    You can use a simple table for counter updates:
    int pr_update[4096][2];
    usage:
    pr = pr_update[pr][y];
    Well, this may be less efficient in terms of speed due to table lookup, but IMO it shouldn't make a big difference, if at all.

    The big advantage is that you can pre-compute any sophisticated update scheme and store the updated counter values in the table. For example, you can blend what you currently use with your shifts by 4 and 5. I use this approach somewhere in my code to apply less aggressive updates when the counter is around 2048 (less skewed probability), tweaking towards more aggressive updates when the probability is more skewed.
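
    A sketch of how such a table could be filled, blending the gentle shift-5 update near 2048 with the aggressive shift-4 update towards the extremes; the threshold and the clamping are my own illustration, not Uwe's actual code:

    static int pr_update[4096][2];

    // Fill the table once at startup.
    void init_pr_update() {
        for (int pr = 0; pr < 4096; ++pr) {
            int skew = (pr < 2048) ? 2048 - pr : pr - 2048;  // 0 = balanced .. 2048 = fully skewed
            int shift = (skew < 1024) ? 5 : 4;               // illustrative threshold
            for (int y = 0; y <= 1; ++y) {
                int next = pr + (((y << 12) - pr) >> shift);
                if (next < 1) next = 1;                      // keep the counter in (0, 4096)
                if (next > 4095) next = 4095;
                pr_update[pr][y] = next;
            }
        }
    }

    // Usage, exactly as above: pr = pr_update[pr][y];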

  5. #5
    donkey7 (Member)
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,475
    Thanks
    26
    Thanked 121 Times in 95 Posts
    Quote Originally Posted by Uwe Herklotz
    I use this approach somewhere in my code to apply less aggressive updates when the counter is around 2048 (less skewed probability), tweaking towards more aggressive updates when the probability is more skewed.
    I am doing the exact opposite.


    encode - maybe you should try SSE (APM in Matt Mahoney's terminology)?
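
    For readers unfamiliar with the term: SSE/APM re-maps the primary model's probability through an adaptive table indexed by a small context. Below is a minimal linear-interpolation sketch on the thread's 12-bit scale; it is simplified (real PAQ APMs interpolate in the stretched/logit domain), and the table sizes and update rate are illustrative:

    // Minimal SSE/APM sketch: refines a 12-bit probability using a context.
    struct APM {
        enum { QUANTS = 33 };   // 33 interpolation points over [0, 4096]
        int t[256][QUANTS];     // up to 256 contexts
        int cx = 0, idx = 0;    // remembered for the update step

        APM() {
            for (int c = 0; c < 256; ++c)
                for (int q = 0; q < QUANTS; ++q)
                    t[c][q] = q * 4096 / (QUANTS - 1);  // identity mapping at start
        }

        // Map (context, primary probability) -> refined probability.
        int refine(int context, int pr) {
            cx = context & 255;
            idx = pr * (QUANTS - 1) / 4096;   // lower interpolation point
            int w = pr * (QUANTS - 1) % 4096; // weight of the upper point
            return (t[cx][idx] * (4096 - w) + t[cx][idx + 1] * w) >> 12;
        }

        // After the bit is known, nudge the two table entries that were used.
        void update(int y) {
            int target = y << 12;
            t[cx][idx]     += (target - t[cx][idx])     >> 6;
            t[cx][idx + 1] += (target - t[cx][idx + 1]) >> 6;
        }
    };

    // Usage: p2 = apm.refine(ctx, p); ...code the bit with p2...; apm.update(y);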

  6. #6
    Uwe Herklotz (Member)
    Join Date
    Mar 2007
    Posts
    34
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by donkey7
    I am doing the exact opposite.
    Of course, this depends on the properties of the data you want to model. As said, you have the advantage of tweaking it according to your needs.

    Quote Originally Posted by donkey7
    encode - maybe you should try SSE (APM in Matt Mahoney's terminology)?
    Surely the best way. But speed would surely be seriously affected (especially decompression).

  7. #7
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Hi Uwe!

    Actually, these days I tried many variants, including PAQ1-inspired non-stationary updates and FPAQ0-inspired stationary updates. The current scheme turns out the best! It looks like it works in a tricky way, combining the power of both stationary and non- or semi-stationary update rules. Of course I know about precomputed tables; the first versions of LZPM used such a thing! Afterwards, I turned to this LZMA-like stuff. Having said that, the latest PAQ versions also use similar code, ((y << 12) - pr) and so on, in places.


  8. #8
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    APM is not acceptable, since decompression speed is very important!

  9. #9
    Uwe Herklotz (Member)
    Join Date
    Mar 2007
    Posts
    34
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    Actually, these days I tried many variants, including PAQ1-inspired non-stationary updates and FPAQ0-inspired stationary updates.
    Did you try using different update shift values depending on bit position? For example, more aggressive updates for the most significant bits only. IIRC that was useful in some sound or BWT coder; maybe it helps in general.
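
    A sketch of what such a position-dependent update could look like; the shift values and table shapes are illustrative, not taken from any of the coders mentioned:

    // One counter per (bit position, order-1 context); the update shift
    // depends on the bit position: aggressive for MSBs, gentle for LSBs.
    static const int SHIFT[8] = {4, 4, 5, 5, 5, 5, 6, 6};  // illustrative values
    static int pr2[8][256];  // pr2[bit_pos][ctx], all initialized to 2048

    void update_bit(int bit_pos, int ctx, int y) {
        int& pr = pr2[bit_pos][ctx];
        pr += ((y << 12) - pr) >> SHIFT[bit_pos];
    }
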
    Quote Originally Posted by encode
    The current scheme turns out the best! It looks like it works in a tricky way, combining the power of both stationary and non- or semi-stationary update rules.
    It is just a good compromise. Using table-driven updates may help a bit, but for a big step you need a secondary stage. However, I agree that APM or similar stuff wouldn't really fit into the LZPM scheme.

  10. #10
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Quote Originally Posted by Uwe Herklotz
    Did you try using different update shift values depending on bit position? For example, more aggressive updates for the most significant bits only. IIRC that was useful in some sound or BWT coder; maybe it helps in general.
    I didn't. I will try, however.

    From a theoretical point of view, this may not help. Both symbols (literals/match lengths) and match indexes are coded using pure order-1 bit-oriented coding. With the new LZPM 0.13 I pre-initialize:

    0XXXXXXXX

    The first coded bit of each symbol must have the highest probability of 0, since 0..255 are literals and 256..511 are match lengths; with literals, the first coded bit is always 0.

    000000000XXXX

    With match indexes, the first 9 bits must have the highest probability of 0 (13-bit values).

    With literals we should have a fast enough update. With indexes we can use a less aggressive update for the first N bits. Or the opposite: with indexes, use a less aggressive update for the last N bits - just handle those bits as noise.
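
    A rough sketch of what such pre-initialization could look like; this is my guess at the idea, not LZPM's code, and the bias value 256 is a guess. With a binary coding tree over the 13-bit match index, the counters covering the first 9 bits start heavily biased towards 0 instead of at 2048:

    // Binary-tree coding of a 13-bit match index: one 12-bit counter per
    // node, root = node 1, children of node i are 2i and 2i + 1.
    static int idx_pr[1 << 13];

    void init_index_counters() {
        for (int node = 1; node < (1 << 13); ++node) {
            // Depth of the node = which bit it codes (GCC/Clang builtin).
            int depth = 31 - __builtin_clz(node);     // 0 for the root .. 12
            idx_pr[node] = (depth < 9) ? 256 : 2048;  // first 9 bits: biased to 0
        }
    }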

    By the way, LZPM 0.13 will be one of the first practical LZ-based compressors to use context coding for match distances/indexes.

    Quote Originally Posted by Uwe Herklotz
    Using table-driven updates may help a bit, but for a big step you need a secondary stage. However, I agree that APM or similar stuff wouldn't really fit into the LZPM scheme.
    Yes, the goal of LZPM is FAST decompression; even compression speed is not so important - with "9" it is just slow, sometimes VERY slow. And APM is not the only thing that could give a compression gain: for higher compression, LZPM should first use a set of models - i.e. order-1-0 or order-2-1-0 instead of just order-1 - and paired with APM this would give a nice gain. However, that would make LZPM more a CM-driven than an LZ-driven compressor, and we would lose the fast and simple decompression. I think LZPM is OK, and LZPM 0.13 is even better - on my machine it has slightly faster decompression and a bit higher overall compression. Anyway, v0.13 is the next step...

  11. #11
    Uwe Herklotz (Member)
    Join Date
    Mar 2007
    Posts
    34
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    I didn't. I will try, however.
    From a theoretical point of view, this may not help. Both symbols (literals/match lengths) and match indexes are coded using pure order-1 bit-oriented coding. ... since 0..255 are literals and 256..511 are match lengths.
    OK, then it may be worth testing different update aggressiveness for literals/match lengths. At least I would try a different update shift value for the MSBs of literals.
    As I understand it, you have an order-1 context for the literal/match flag (first bit) too, right? What is the context for the ROLZ match finder? What about using a state map (maybe combined with bits from order-N) as the context for the literal/match flag?

  12. #12
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Quote Originally Posted by Uwe Herklotz
    As I understand it, you have an order-1 context for the literal/match flag (first bit) too, right?
    Yes, the whole 9-bit value (literal or match length) is coded using an order-1 context.

    Quote Originally Posted by Uwe Herklotz
    What is the context for the ROLZ match finder?
    With LZPM I use an order-1 context to "reduce" the offset set (as the original ROLZ-ROLZ3 do).
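
    For context, a minimal sketch of the ROLZ idea - my illustration, not LZPM's implementation, and the 64-slot table size is an assumption: each order-1 context keeps a small table of recent positions, and a match is coded as an index into that table instead of a raw window offset.

    enum { SLOTS = 64 };
    static unsigned tab[256][SLOTS];   // recent positions, per previous-byte context
    static int head[256];              // ring-buffer write index per context

    // Call after coding the byte at 'pos' (pos >= 1).
    void update_rolz(const unsigned char* buf, unsigned pos) {
        int ctx = buf[pos - 1];        // order-1 context: the previous byte
        tab[ctx][head[ctx]] = pos;
        head[ctx] = (head[ctx] + 1) % SLOTS;
    }

    // A match is then transmitted as (index into tab[ctx], length): the index
    // needs only 6 bits here, instead of a full window offset.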

    Quote Originally Posted by Uwe Herklotz
    What about using a state map (maybe combined with bits from order-N) as the context for the literal/match flag?
    I tried various schemes, including the scheme from LZMA. Maybe I went wrong somewhere, but my current scheme turns out to be the best - and the simplest, by the way. I found that the ROLZ scheme has different properties compared to LZ77. I carefully browsed the LZMA source and tried almost all of its tricks with LZPM; nothing worked.


  13. #13
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    ...tried using different shift values for the first coded bit of symbols (literal/match length) - the current scheme is again the best so far...

    Will try to use some tricky contexts...

