# Thread: Prob/counter update

1. Just some code examples for FPAQ0P-like compression. Actually LZPM uses this type of coding.

Originally Posted by Basic update
```
if (y)
    pr += (4096 - pr) >> 5;
else
    pr -= pr >> 5;
```

Of course, initially pr = 2048.

Originally Posted by Advanced scheme

```
pr += (pr < 2048) + (((y << 12) - pr) >> 5);
```

As you can see, the counter update code fits in one line. It may also work faster than the previous code.

The (pr < 2048) term is needed for balancing the probability.

Instead of shifting by 5 we can shift by 4 (aggressive update).

The question is: can we use something more efficient in terms of speed or compression, i.e. more accurate counter emulation with more efficient code?

2. In Rings and Turtle:

pre-calculated:

```
if (y)
    pr[state] += (256 - pr[state]) >> 3;
else
    pr[state] -= pr[state] >> 3;
```

Initially all pr[state] = 128.

3. Originally Posted by encode
The question is: can we use something more efficient in terms of speed or compression, i.e. more accurate counter emulation with more efficient code?
I'm using something more sophisticated; I've explained my method to Francesco (here: http://www.encode.su/forums/index.php?action=vthread&forum=1&topic=465&page=3#msg6575 - first 43 lines).

It's very slightly better than your code and has almost the same speed.

Basically, your update method is optimal when probabilities aren't skewed, e.g. probabilities in the range <0.05; 0.95>.

4. You can use a simple table for counter updates:

```
int pr_update[4096][2];
```

Usage:

```
pr = pr_update[pr][y];
```
Well, this may be less efficient in terms of speed due to table lookup, but IMO it shouldn't make a big difference, if at all.

The big advantage is that you can pre-compute any sophisticated update scheme and store the updated counter values in the table. For example, you can blend over what you currently use with your shifts by 4 or 5. I use this approach somewhere in my code to apply less aggressive updates if the counter is around 2048 (less skewed probability) while tweaking towards more aggressive updates if the probability is more skewed.

5. Originally Posted by Uwe Herklotz
I use this approach somewhere in my code to apply less aggressive updates if the counter is around 2048 (less skewed probability) while tweaking towards more aggressive updates if the probability is more skewed.
I am doing the totally opposite thing.

encode - maybe you should try SSE (APM in Matt Mahoney's terminology)?

6. Originally Posted by donkey7
I am doing the totally opposite thing.
Of course, this depends on the properties of the data you want to model. As said, you have the advantage of tweaking it according to your needs.

Originally Posted by donkey7
encode - maybe you should try SSE (APM in Matt Mahoney's terminology)?
Surely the best way. But it is also certain that speed would be affected seriously (especially decompression).

7. Hi Uwe!

Actually, these days I tried many variants, including PAQ1-inspired non-stationary updates and FPAQ0-inspired stationary updates. The current scheme turns out to be the best! It looks like it works in a very tricky way, combining the power of both stationary and non- or semi-stationary update rules. Of course I know about precomputed tables - the first versions of LZPM used such a thing! Afterwards, I turned to this LZMA-like stuff. Having said that, the latest PAQ versions also use similar code somewhere ((y << 12) - pr, etc.).

8. APM is not acceptable, since decompression speed is very important!

9. Originally Posted by encode
Actually, these days I tried many variants, including PAQ1-inspired non-stationary updates, FPAQ0-inspired stationary updates.
Did you try using different update shift values depending on bit position? For example, using more aggressive updates for the most significant bits only. IIRC that was useful in some sound or BWT coder; maybe it helps in general.
Originally Posted by encode
Current scheme turns out the best! Looks like it works very tricky and combines the power of both stationary and non or semi-stationary update rules.
It is just a good compromise. Using table-driven updates may help a bit, but for a big step you need a secondary stage. However, I agree that APM or similar stuff wouldn't really fit into the LZPM scheme.

10. Originally Posted by Uwe Herklotz
Did you try using different update shift values depending on bit position? For example, using more aggressive updates for the most significant bits only. IIRC that was useful in some sound or BWT coder; maybe it helps in general.
I didn't, but I will try.

From a theoretical point of view, this may not help. Both symbols (literals/match lengths) and match indexes are coded using pure order-1 bit-oriented coding. With the new LZPM 0.13 I pre-initialize:

0XXXXXXXX

The first coded bit of each symbol must have the highest probability of 0, since 0..255 are literals and 256..511 are match lengths - with literals the first coded bit is always 0.

000000000XXXX

With match indexes the first 9 bits must have the highest probability of 0 (13-bit values).

With literals we should have a fast enough update. With indexes we can use a less aggressive update for the first N bits. Or the opposite: with indexes use a less aggressive update for the last N bits - just handle these bits as noise.

By the way, LZPM 0.13 will be one of the first practical LZ-based compressors which use context coding for match distances/indexes.

Originally Posted by Uwe Herklotz
Using table-driven updates may help a bit but for big step you need a secondary stage. However, I agree that APM or similar stuff wouldnt really fit into LZPM scheme.
Yes, the goal of LZPM is FAST decompression. Even compression speed is not so important - with "9" it is just slow, sometimes VERY slow. I think not only APM could give a compression gain; first of all, for higher compression LZPM should use a set of models - i.e. order-1-0, order-2-1-0, instead of just order-1 - and paired with APM this would give some nice gain. However, that would make LZPM more a CM-driven than an LZ-driven compressor, and we would lose the fast and simple decompression. I think LZPM is OK, and LZPM 0.13 is even better - on my machine it has slightly faster decompression and a bit higher overall compression. Anyway, v0.13 is the next step...

11. Originally Posted by encode
I didn't, but I will try.
From a theoretical point of view, this may not help. Both symbols (literals/match lengths) and match indexes are coded using pure order-1 bit-oriented coding. ... Since 0..255 are literals and 256..511 are match lengths.
Ok, then it may be worth testing different update aggressiveness for literals/match lengths. At least I would try a different update shift value for the MSBs of literals.
As I understood, you have an order-1 context for the literal/match flag (first bit) too, right? What is the context for the ROLZ match finder? What about using a state-map (maybe combined with bits from order-N) as the context for the literal/match flag?

12. Originally Posted by Uwe Herklotz
As I understood, you have an order-1 context for the literal/match flag (first bit) too, right?
Yes, the whole 9-bit value (literal or match length) is coded using an order-1 context.

Originally Posted by Uwe Herklotz
What is the context for ROLZ match finder?
With LZPM I make use of an order-1 context to "reduce" the offset set (as the original ROLZ-ROLZ3 does).

Originally Posted by Uwe Herklotz
What about using a state-map (maybe combined with bits from order-N) as the context for the literal/match flag?
I tried various schemes, including the scheme from LZMA. Maybe I was wrong somewhere, but my current scheme turns out to be the best - and the simplest, by the way. I found that the ROLZ scheme has different properties compared to the LZ77 one. I carefully browsed the LZMA source and tried almost all of its tricks with LZPM; nothing worked.

...tried using different shift values for the first coded bit of symbols (literal/match length) - the current scheme is again the best so far...

Will try to use some tricky contexts...
