1. > compared to standard speed = 40:
> - after hit: prob = prob + (1 - prob) / 40 = 0.5125
> - after miss: prob = prob - prob / 40 = 0.4875
> speed controls how much of the error should be used for the update; it isn't a
> direct update value itself.

Well, this calls for some comments...
I think that this prediction/error interpretation is bad for theory,
because it's inherently associated with exact prediction of the next value in a series,
the LMC metric, and such - while the only metric a compressor needs is the Shannon metric.

So, there are two interpretations that I generally use, depending on the case:

1. Combinatoric

For a binary sequence with only the added information that there are n0 zeroes and n1 ones,
there are (n0+n1)!/(n0!*n1!) equiprobable cases, and it's easily provable that a counter
like this would be optimal:

p1 = n1/(n0+n1); // static

if( y ) c1++; else c0++; p = c1/(c1+c0); // dynamic

Then, for nonstationary data with local density variations,
a weighted Shannon metric can be used, and so an
interval range sequence for optimal encoding would be

if( y ) c1=c1*w+1,c0=c0*w; else c0=c0*w+1,c1=c1*w; p = c1/(c1+c0);

And as a speed optimization, it can be rewritten to avoid
division:

p = c1/(c1+c0);
if( y ) p'=(c1*w+1)/(c0*w+c1*w+1); else p'=(c1*w)/(c0*w+c1*w+1);
---
if( y ) p'=(c1+1/w)/(c0+c1+1/w); else p'=c1/(c0+c1+1/w);
---
if( y ) p'=p*(c0+c1)/(c0+c1+1/w)+(1/w)/(c0+c1+1/w);
else p'=p*(c0+c1)/(c0+c1+1/w);
---
T = c0+c1
T' = c0*w+c1*w+1 = (T+1/w)/(1/w) = T*w+1
wr = 1-(c0+c1)/(c0+c1+1/w) = (1/w)/(c0+c1+1/w) = 1/T'
if( y ) p'=p*(1-wr)+wr; else p'=p*(1-wr);
p = p'; T = T'

Then, to compensate for supposedly completely random bits
(not correlated with recent history), the prediction of the
counter can be further improved by static mixing with
a fixed "order-(-1)" distribution:

P = p*(1-mw)+0.5*mw

which means that supposedly the counter's prediction "p"
is applicable to the current symbol/bit with
probability (1-mw), and it's a completely random bit
otherwise. (Of course, any distribution can be used
instead of {0.5,0.5}, but let's skip that for simplicity.)

As experience shows this assumption is widely
useful, the mix can be merged with the actual counter:

P = p*(1-mw)+0.5*mw
T = T*w+1; wr = 1/T;
if( y ) (p*=(1-wr))+=wr; else p*=(1-wr);
---
P' = ((p*(1-wr)+y*wr)*(1-mw)+0.5*mw) =
= p*(1-wr)*(1-mw)+0.5*mw+y*wr*(1-mw) =
= ((P-0.5*mw)/(1-mw))*(1-wr)*(1-mw)+0.5*mw+y*wr*(1-mw) =
= (P-0.5*mw)*(1-wr)+0.5*mw+y*wr*(1-mw) =
= P*(1-wr) - 0.5*mw + 0.5*mw*wr + 0.5*mw + y*wr*(1-mw) =
= P*(1-wr) + 0.5*mw*wr + y*wr*(1-mw) =
= P*(1-wr) + (y*(1-mw)+0.5*mw)*wr
// y*(1-mw)+0.5*mw == y?1-0.5*mw:0.5*mw
---
T = T*w+1; wr = 1/T; omp = 0.5*mw*wr;
P = P*(1-wr)+(y?wr-omp : omp);

And that's exactly what I mostly use.
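As a floating-point sketch (the initial P, T and the parameter values here are placeholders, to be tuned per model):

```cpp
#include <cassert>
#include <cmath>

// The merged update from above: T = T*w + 1; wr = 1/T; omp = 0.5*mw*wr;
// P = P*(1-wr) + (y ? wr-omp : omp).
struct Counter {
  double P, T, w, mw;
  Counter(double w_, double mw_, double P0 = 0.5, double T0 = 1)
      : P(P0), T(T0), w(w_), mw(mw_) {}
  void update(int y) {
    T = T * w + 1;
    double wr = 1 / T;
    double omp = 0.5 * mw * wr;
    P = P * (1 - wr) + (y ? wr - omp : omp);
  }
};
```

Note the fixed points: on a stream of ones P converges to 1-0.5*mw, and on zeroes to 0.5*mw, so mw bounds the prediction away from 0 and 1 exactly as the static mix does.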

The point here is that such a counter is a generalization
of the combinatoric counter and so is able to precisely simulate
the implementation with dumb counts of occurrences.
And the combinatoric counter is good because it allows
efficient encoding of even a completely non-redundant bit sequence.
Its only redundancy is the information contained in the parameter values
(which include the initial values for P and T).

2. Bit sequence hash

When dynamic probability mapping is used (aka SSE etc.),
a hash interpretation makes a lot of sense too.
Well, it's obvious that in fact a "probability" updated
like that is a convolution of the bit sequence, and so it
highly correlates with the current suffix bits, especially
when a fast "update speed" is used (and mostly it is fast enough).

Thus, because of the indirect use, a counter-context like that
has a completely different optimization metric compared to
the combinatoric counter, and so may (and actually should) be
implemented with different properties.

For example, the combination of a normal counter delayed by
a few bits (that is, updated with bit[x-k] when bit[x] is modelled)
and a static linear mapping ( a[bit[x-k..x-1]]*p+b[bit[x-k..x-1]]*(1-p) )
does show good results.

371699 // test with a single SSE array over normal counter array (fully tuned)
370757 // counters delayed by 3 bits and a static mapping before SSE added

And I consider this improvement fairly major, as there's no model structure
change, no contexts or dynamic elements added, and the parameters for the first
result were fully optimized by bruteforce search (well, for the second as well).
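A minimal sketch of the delayed-counter idea (the delay K=3, the speed 40 and the identity initialization of the static mapping are hypothetical placeholders; the real a[], b[] values would be tuned offline by the same bruteforce search):

```cpp
#include <deque>

// Counter is updated with bit[x-K] while bit[x] is being modelled;
// a static linear map indexed by the last K bits adjusts the prediction.
const int K = 3;

struct DelayedCounter {
  double p = 0.5;
  std::deque<int> last;         // the K most recent bits (not yet counted)
  double a[1 << K], b[1 << K];  // static mapping a[s]*p + b[s]*(1-p)
  DelayedCounter() {
    for (int s = 0; s < (1 << K); s++) { a[s] = 1; b[s] = 0; }  // identity
  }
  int suffix() const {  // pack the last K bits into a table index
    int s = 0;
    for (int bit : last) s = s * 2 + bit;
    return s;
  }
  double predict() const {
    int s = suffix();
    return a[s] * p + b[s] * (1 - p);
  }
  void update(int y) {
    last.push_back(y);
    if ((int)last.size() > K) {  // the counter only sees bit[x-K] now
      int old = last.front();
      last.pop_front();
      p += (old - p) / 40;       // plain counter, speed 40
    }
  }
};
```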

But still, this option is highly dependent on implementation details of the
used dynamic mapping (SSE) and is not understood as well as "normal" counters.
That is, in the most general case of this approach the counter can be replaced by
a static lookup table mapping the bit sequence in context to some precalculated
hash values. But, setting aside the necessary size of this lookup table
(~2^40 should be a common value), there's no known way for now to optimize
the table contents except by directly checking the compression result with
different settings.

> speed should be lower for probs like 0.01 or 0.99 because with high speed
> occasional negative bit in a equal bits run would ruin the statistics.
> ie. if we have sequence:
> 0000000000000000000000100000000000000000000000000
> then on sequence:
> 0000000000000000000000
> we should consistently lower the speed because if we encounter bit 1
> it shouldn't change statistics very much because the following bits are
> 0000000000000000000000000

A simple combinatoric counter actually handles this situation pretty well.
With a completely precise implementation it is able to map strings like
('0' x N1 . '1' . '0' x N2) directly to values from 0 to N-1=N1+N2
(with pre-known N).
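The mapping is easy to write down explicitly (a toy illustration; N = N1+1+N2 is assumed known to both encoder and decoder):

```cpp
#include <cassert>
#include <string>

// A string with exactly one '1' among N bits is fully described by the
// position of that '1' - an integer in [0, N-1], i.e. log2(N) bits total.
int encodePos(const std::string& s) {
  return (int)s.find('1');
}

std::string decodePos(int pos, int n) {
  std::string s(n, '0');
  s[pos] = '1';
  return s;
}
```

Transmitting the position costs log2(N) bits, which is exactly the combinatoric bound for a string with a single '1' (N equiprobable cases).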

Also, of course, sometimes counter tweaks give a gain in compression anyway.
But then, if the gain's origin can be analyzed, it is usually possible to
add explicit support for some special cases and leave the counter alone.
The quoted example is exactly a case like that.

2. Continuing about counters. Look at a draft version of the table building:
Code:
```
for (int i = 0; i < 4096; i++) {
    int j = 4095 - i;
    tab[0][i] = i - (i / (18 + (j >> 10))); // y = 0
    tab[1][i] = i + (j / (18 + (i >> 10))); // y = 1
}```

usage: pr = tab[y][pr];
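Reading the two assignments as the y=0 and y=1 rows of a tab[2][4096] table (an assumption based on the tab[y][pr] usage line), the table can be built and sanity-checked like this:

```cpp
#include <cassert>

// Two-row update table: tab[y][pr] gives the next 12-bit probability state.
int tab[2][4096];

void buildTab() {
  for (int i = 0; i < 4096; i++) {
    int j = 4095 - i;
    tab[0][i] = i - (i / (18 + (j >> 10)));  // bit was 0: move p down
    tab[1][i] = i + (j / (18 + (i >> 10)));  // bit was 1: move p up
  }
}
```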

LZPM with such a table compresses my calgary.tar to 859,191 bytes, which is close to the double-counter variant. Note that on some files the compression is better than with the double counter. But the main thing is speed - it looks even faster than the straight fpaq0p variant...

3. Shelwien
please consider that i'm talking about lzp stuff. in my compressor i plan to model literals like ppmd does (for efficient memory utilization - i want to limit the number of symbols to ~45 in an order-2 context and ~15 in an order-3 context and use a fixed size for each context for speed).

binary alphabets would be used only for encoding escapes (lzp and ppm escapes).

lzp flag sequence 000000000100000000 would correspond to:
aaaaaaaaaaaabbbbbbbbbbbb. flag sequence 00000000011000000 can correspond to aaaaaaaabccccccccc or aaaaaaaaaabaaaaaaaa; i don't store that information anywhere, so estimating gets more complicated.

also there are regions in the file with better and worse predictability (in the sense of lzp matching). i must exploit that effect by sse.

4. @encode:
Wonder if you could try these two:

Code:
```
for( int i = 1; i < 4096; i++ ) {
    int SCALElog = 16;
    int SCALE = 1 << SCALElog;
    int sp1 = 18 + ((4096 - i) >> 10);
    int wr1 = SCALE / sp1;
    int pr1 = i*(SCALE - wr1) + (i % sp1)*wr1;
    int sp2 = 18 + (i >> 10);
    int wr2 = SCALE / sp2;
    int pr2 = i*(SCALE - wr2) + (4096 - (4096 - i) % sp2);
    tab[0][i] = __max(1, pr1 >> SCALElog);
    tab[1][i] = __min(4095, pr2 >> SCALElog);
}

for( int i = 1; i < 4096; i++ ) {
    int SCALElog = 16;
    int SCALE = 1 << SCALElog;
    int sp1 = 18 + ((4096 - i) >> 10);
    int wr1 = SCALE / sp1;
    int mw1 = 10;
    int pr1 = i*(SCALE - wr1) + mw1*wr1;
    int sp2 = 18 + (i >> 10);
    int wr2 = SCALE / sp2;
    int mw2 = 10;
    int pr2 = i*(SCALE - wr2) + (4096 - mw2);
    tab[0][i] = __max(1, pr1 >> SCALElog);
    tab[1][i] = __min(4095, pr2 >> SCALElog);
}```

Also consider swapping the [y] and [p] indexes.

5. Originally Posted by Shelwien
Wonder if you could try these two:
Something is wrong with these two - the compressed file becomes even larger than the original in both cases.

6. @encode:

I seem unable to avoid mistakes without actually testing the code

7. Originally Posted by Shelwien
OK, wait one minute, I assemble something...

And now check out a simple and more advanced scheme:
Code:
```
for (int i = 0; i < 4096; i++) {
    tab[0][i] = i - N(i);
    tab[1][i] = i + N(4095 - i);
}
// ...
int N(int i) {
    int j = (i < 2048) + (i < 1024) + (i < 512) + (i < 256)
          + (i < 128) + 19;
    return (i / j);
}
```

With this one, LZPM 0.15 compresses calgary.tar to 859,066 bytes.

8. OK, check out:
fpaq0p-sb.zip

9. Thanks, it helped.

Seems that in that post I really did miss some *wr2.
But now it works and we have a real proof that
randomization in counter update is bad.

Also notice the speed difference (I compiled all
versions myself with the same options). It's probably
due to [y] and [p] being swapped and more cache hits in
my version.

> j = (i<2048)+(i<1024)+(i<512)+(i<256)+(i<128)+19;

I'd suggest to bruteforce these numbers as well,
or you won't know how it should really look.
Well, the best way would be to optimize the whole
table, but it'll really take some time.

10. Originally Posted by Shelwien
Its probably
due to [y] and [p] swapped and more cache hits in
my version.
Yep, it seems tab[pr][y] should be better.

Originally Posted by Shelwien
I'd suggest to bruteforce these numbers as well,
or you won't know how it should really look.

Well, I did and will do more testing, to find out what's the best. Like you see, my approach is very simple, and it looks like it works.

Originally Posted by Shelwien
Well, the best way would be to optimize the whole
table, but it'll really take some time.

What do you mean? Do you have any additional ideas?

11. Maybe I should calculate tab using 16-bit or higher precision (for more accurate division), rounding down to 12...

12. >> I'd suggest to bruteforce these numbers as well,
>> or you won't know how it should really look.
>
> Well, I did and will do more testing, to find out what's the best.

It's actually better to prepare a parameter tuning platform
right from the start...

> Like you see my approach is very simple, and looks like it works.

If you meant your 12-bit counters with lookup update, there's
no wonder actually. As I've already said, 12 bits is ok.
However, I'm not sure if it'll still be ok if you'd try
to use e.g. 10 different update tables along with other lookups.

Also there's a significant gain from changing the update speed
according to amount of accumulated statistics - surprisingly,
even under SSE. Simple is not always good, as it's not adaptive.

> Well, the best way would be to optimize the whole
> table, but it'll really take some time.
>
> What do you mean?

This is a bruteforce-optimized version of the coder from fpaq0p-sb_sh.rar

wcc386
399392 // fpaq0p-sb_sh
399083 // current

Btw, I actually don't mind giving you my optimizer,
but it's a perl script which patches the executable
and runs it multiple times.

> Do you have any additional ideas?

For what?
I'm not really interested in making things fast at the cost of
precision, so won't be much help here.

Btw, have you considered the fact that an LZ scheme can only be used
for "universal" file compression (and without proper format support,
even precomp-style), while good CM techniques can be applied wherever
a good model of some process is needed - not limited to compression at all.

13. There's a simple formula for error cost in interval coding.
Supposing that you have k bit precision for range value, then
e = -100*log2( 1-0.5/2^k )
is the redundancy in percent.

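The table below can be cross-checked with a few lines of C++ (not from the original post):

```cpp
#include <cmath>

// Redundancy in percent caused by k-bit range precision:
// e = -100 * log2(1 - 0.5/2^k)
double redundancyPercent(int k) {
  return -100.0 * std::log2(1.0 - 0.5 / std::pow(2.0, k));
}
```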
Code:
```
In:= Table[ {k, -100*Log[2, 1 - 0.5/2^k]}, {k, 1, 20}]

Out= {{1, 41.5037}, {2, 19.2645}, {3, 9.31094}, {4, 4.58037}, {5,
2.27201}, {6, 1.13153}, {7, 0.564656}, {8, 0.282052}, {9,
0.140957}, {10, 0.0704613}, {11, 0.0352263}, {12, 0.0176121}, {13,
0.00880578}, {14, 0.00440282}, {15, 0.00220139}, {16,
0.00110069}, {17, 0.000550346}, {18, 0.000275173}, {19,
0.000137586}, {10, 0.0704613}, {20, 0.0000687931}}```

14. Tested your versions with LZPM. Well, it looks like your approach gives no improvement. We get higher compression on calgary.tar and lose it on many other files, and so on. The catch is in tuning.
Code:
```
int N(int i) {
    int j = 19;
    for (int k = 128; k < 4096; k <<= 1)
        j += (i < k);

    return (i / j);
}```
This one works pretty well. I tried many schemes, including Fibonacci, i=i+i, i=i*i, ... This one turns out to be the best. In addition, I analyzed what's going on with the double counter - it looks like we cannot do an exact approximation with tables. In terms of the N(i) function, double counters use a large range of divisors.
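The loop form just counts how many of the power-of-two thresholds lie above i, so it can be checked against the explicit comparison form used in the other posts:

```cpp
#include <cassert>

// Loop form: the divisor grows from 19 to 24 as i falls below 2048..128.
int N(int i) {
  int j = 19;
  for (int k = 128; k < 4096; k <<= 1)
    j += (i < k);
  return i / j;
}

// Explicit form with the same thresholds, for comparison.
int Nexplicit(int i) {
  int j = 19 + (i < 128) + (i < 256) + (i < 512) + (i < 1024) + (i < 2048);
  return i / j;
}
```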

The brute-force stuff can help for a given file - not for all files.

15. > Tested your versions with LZPM. Well, looks like your approach gives no
> improvement.

So what do you call an "approach" here? My specific parameter values
were optimized on wcc386 only, so obviously they are not "common" or anything.
Normally such an optimization should be done on a whole test dataset,
but I don't have that kind of resources to throw away.

> We get higher compression on calgary.tar and lose it on many
> other files, and so on. The catch is in tuning.

As I said the point is adaptation.
Meaning faster initial learning and at least some dynamic mixing,
though multidimensional dynamic mapping is always more efficient.

> for (int k = 128; k < 4096; k <<= 1)

The optimized values in my example were
{529, 604, 844, 1868, 1900}.
So I'd suggest to concentrate on interval [512..2048]
... though that could be different in fpaq and LZPM.

> This one works pretty well. I tried many schemes including Fibonacci, i=i+i,
> i=i*i, ...

Shamanic ways again

> This one turns out the best. In addition, I analyzed what's going on
> with double counter - looks like we cannot do exact approximation with tables.

Here are some ideas about further experiments with the double counter.
1. Use a mix of two table-based counters and tune their parameters
to get a good result
2. Try to mix the tables and use a single "mixed" counter
(probably won't work)
3. Still use the mixed table, but add a difference table which would
allow to restore the original counter values
4. Difference table probably can be indexed by a difference counter,
and its precision can be limited, so that mixed counter + diff.state
could be packed into 16 bits. Further optimizations ensue.
5. Updated counter state and probability estimation could be totally
separated (two different tables for estimation and update).

> The brute-force stuff can help for a given file - not for all files.

That's wrong. Even if you use some random values like fibonacci numbers,
it's still the same parameter tuning.

Of course, a proper sample should be used for parameter optimization,
which I obviously didn't use, as my purpose wasn't to supply you with a
perfectly optimized counter, but to show you an example of how you're
supposed to do it.

16. OK, thank you for the hints! I just need to do more experiments.

Overall, I like the table-driven stuff - I found it's even faster than pr updates with shifts. Note that the goal of LZPM is FAST decompression - even at the cost of compression ratio. Over a few weeks or months I will do many experiments (hand-crafted brute-force), as always, and decide what to use and what not.

17. Trying to get into true brute-force, I added additional parameters to LZPM and generated a batch file - a few GB long.

18. By the way, your numbers work fairly well! Code:
```
int N(int i) {
    int j = 18
          + (i < 1900) + (i < 1868) + (i < 844) + (i < 604) + (i < 529);

    return (i / j);
}```

With such N(), LZPM compresses calgary.tar to 858,964 bytes.

19. Code:
```
int N(int i) {
    const int a1 = 512;
    const int a2 = a1 + 128;
    const int a3 = a2 + 256;
    const int a4 = a3 + 1024;
    const int a5 = a4 + 128;

    int j = 17
          + (i < a1) + (i < a2) + (i < a3) + (i < a4) + (i < a5);

    return (i / j);
}```
calgary.tar compressed to: 858,883 bytes

20. Or even with a 6-point envelope:
Code:
```
int N(int i) {
    const int a1 = 256;
    const int a2 = a1 + 256;
    const int a3 = a2 + 128;
    const int a4 = a3 + 256;
    const int a5 = a4 + 1024;
    const int a6 = a5 + 128;

    int j = 18
          + (i < a1) + (i < a2) + (i < a3)
          + (i < a4) + (i < a5) + (i < a6);

    return (i / j);
}```
calgary.tar -> 858,880 bytes

21. > Trying to get into the true brute-force I add to the LZPM an
> additional parameters and generate a batch file - a few GB long.

Well, here's my optimizer, with perl included:
It should work right after unpacking by starting 1.bat

But actually it's better to completely separate the
specific module you're working on into a standalone
program. Of course, you'd need to prepare a dataset
for it - usually that's done by logging the necessary
values during the normal run.

Meaning, as I said before, that with your LZ it's certainly
useful to log the matches you've got from processing your
data, and then tune by regenerating these matches.

> By the way, your numbers work fairly well!
> int j = 18 + (i<1900) + (i<1868) + (i<844) + (i<604) + (i<529);
> With such N(), LZPM compresses calgary.tar to 858,964 bytes.

Wonder what you tried before then...

> const int a1 = 512;
> const int a2 = a1 + 128;
> const int a3 = a2 + 256;
> const int a4 = a3 + 1024;
> const int a5 = a4 + 128;
> calgary.tar compressed to: 858,883 bytes

Well, this is not shamanic yet, just a rough
approximation. Still, I'll bet that
"commonly optimal" values, optimized on a large dataset,
wouldn't be powers-of-two or anything else you'd
consider simple.

22. Hi!

Did anybody ever notice this - it's nothing special...

an "adjustment proportional to a fraction of the error" is in fact a non-stationary counter of the form:

p1 = n1 / n

Update, after bit y:

p1' = (c n1 + y) / (c n + 1) = n1' / n'

And now take the difference:

p1' - p1

= (c n1 + y) / (c n + 1) - n1 / n
= (c n1 n + y n - c n1 n - n1) / (n (c n + 1))
= (y - n1 / n) / (c n + 1)

Let the denominator's term be n' = c n + 1 (as stated above); in other words:

n(k+1) = c n(k) + 1

If n is initially zero, we get:

n(0) = 0
n(1) = 1
n(2) = c+1
n(3) = c^2 + c + 1
...
n(k) = c^(k-1) + ... + c + 1 = sum from i=0 to k-1 of c^i

Since c<1 (giving more weight to newer events), letting k go to infinity we get:

n(k->infinity) = 1/(1-c)

So the denominator is constant after enough iterations. In the "infinite" case we get:

p1' = p1 + (y - p1) * (1-c)

Which could be

p1 += (y-p1) >> 5

23. Well, I did notice. But:

1. Counter gives better predictions when you keep this n(k)
along with the probability (see first post in this thread).
Of course it's slower though.

2. Updates like "p1+=(y-p1)>>5" introduce noise into the counter.
See that post.

24. Originally Posted by Shelwien
1. Counter gives better predictions when you keep this n(k)
along with the probability (see first post in this thread).
Of course it's slower though.
If you make enough adjustments - mathematically infinity, practically (due to rounding, etc.) a finite number of steps - it doesn't matter. When this counter hasn't gone into steady state, this is something like Matt's decreasing adaptation speed; i can't remember the formula he used though.

BTW:

When looking at my CM's probability output i thought: what would you do if you were a mixer? Assign high weights to the most probable predictions (and some logical stuff to avoid a conflict, e.g. order-1 gives p1=0, order-6 gives p1=1, so trust higher orders).

Has anybody ever tried entropy based mixing? This could be implemented just by a table lookup.

In some experiments i noticed that the "normal" context mixing does a fairly good job at lower orders (say up to 5 or 6). Higher orders offer too few statistics to be modeled the same way, but are usually highly predictive. Including a match model (paq-style) doesn't actually model this redundancy, it only includes a rough measure of predictability (match length); i'm sure the NN does a better job here than my mixer.
I think higher orders can be modeled using a "binary context" with just one symbol, if the context match is long enough.
After some web search i found ppmdet, which does something similar. Charles Bloom also included a nice description on his page (which looks promising). What about using something like this for higher-order contexts? This could give a speedup, too.
And i noticed something else in my empirical experiments - you can just gain compression by modelling predictive symbols even better. i found no way to model "noisy" symbols in a better way...

EDIT: i agree that something like this is random, due to the modulo:

pr = pr*(SCALE-wr1) + (SCALE-(SCALE-pr)%Speed)*wr1; // y=1

But i cant see how you derive this modulo from an update like above. 25. > > 1. Counter gives better predictions when you keep this n(k)
> > along with the probability (see first post in this thread).
> > Of course it's slower though.
>
> If you made enough adjustments, mathematically infinity, practically (due
> to rounding, etc) after a finite number of steps, it doesn't matter.

I have lookup tables for mapping k to n(k) in my coders,
their declaration looks like this:
[...]
DefTbl( 250, 1400 );
DefTbl( 200, 1050 );
DefTbl( 175, 950 );
DefTbl( 150, 800 );
DefTbl( 120, 600 );
DefTbl( 100, 460 );
DefTbl( 80, 450 );
DefTbl( 64, 300 );
DefTbl( 45, 200 );
DefTbl( 40, 200 );
DefTbl( 34, 200 );
DefTbl( 28, 128 );
[...]
The first parameter here is an inverse speed (e.g. 40 means c=1-1/40 in your terms),
and the second is the table size... that is, your "infinity".
So it's very small actually, that's right.
But still I don't like the compression loss in the version with fixed
update speed, so I have to deal with additional statistics.
I guess the "infinity" is not always reached in all counters if the file
is not really huge... especially at higher orders.

> When looking at my CM's probability output i thought: What would you do, if
> you were a mixer - Assign high weights to the most probable predictions
> (and some logical stuff to avoid a conflict, e.g. order1 gives p1=0, order6
> gives p1=1, so trust higher orders)

Well, I'd advise not to try using common sense for model design -
it just doesn't work.

So if you really want something as a target, then gather statistics
from paq8 - add your own counters there and see what paq8 predicts
for given predictions of your submodels.

> Has anybody ever tried entropy based mixing?
> This could be implemented just by a table lookup.

How's that? Accumulate length of code in a context and derive weight from it?
I tried it and it wasn't particularly good.
Also there was an algorithm/compressor named CTW, which attempted something
along those lines, if I'm not mistaken.
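For reference, one possible reading of "entropy-based mixing", sketched from scratch (the decay constant and the 2^(-len) weighting are arbitrary choices here, not taken from any of the coders discussed):

```cpp
#include <cmath>
#include <vector>

// Keep an exponentially smoothed code length (in bits) per submodel and
// weight each submodel's p1 prediction by 2^(-len): shorter recent codes
// mean more weight.
struct EntropyMixer {
  std::vector<double> len;  // smoothed code lengths, in bits
  double decay;
  EntropyMixer(int n, double d) : len(n, 1.0), decay(d) {}
  double mix(const std::vector<double>& p1) const {
    double num = 0, den = 0;
    for (size_t i = 0; i < p1.size(); i++) {
      double w = std::pow(2.0, -len[i]);
      num += w * p1[i];
      den += w;
    }
    return num / den;
  }
  void update(const std::vector<double>& p1, int y) {
    for (size_t i = 0; i < p1.size(); i++) {
      double cost = -std::log2(y ? p1[i] : 1 - p1[i]);
      len[i] = len[i] * decay + cost * (1 - decay);
    }
  }
};
```

The mix itself is then a plain table-free weighted average, so as a speed optimization the 2^(-len) factors could indeed come from a lookup table.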

> In some experiments i noticed that the "normal" context mixing does a
> fairly good job at lower orders (say up to 5 or 6).

How do you know?
That is, how do you determine which prediction quality is good for a given submodel?

> I think higher orders can be modeled using a "binary context" with just one
> symbol, if the context match is long enough.

Well, they can. With significant compression loss.
Again, imho the right way to design a compressor is to build something with
which you'd be sure that you cannot afford better quality, and then apply
speed optimizations, which might include skipping or regrouping counter
updates and such. Actually, you can even finish with something like LZ77
along that way. But I don't believe that its possible to get a good model
by improving a normal LZ77, as its much harder to remove speed optimizations
than to apply them.

> What about using something like this for higher order contexts?

That's quite common actually.

> This could give a speedup, too.

Wonder what exactly you intend to do with your compressor...
For comparisons it's better to concentrate on beating paq8, and for any
practical tasks there's no sense in writing a dictionary model, as even
simple custom models for specific formats perform significantly better.

> EDIT: i agree that something like this is random, due to the modulo:
>
> pr = pr*(SCALE-wr1) + (SCALE-(SCALE-pr)%Speed)*wr1; // y=1
>
> But i can't see how you derive this modulo from an update like above.

p += x/w
p = (p*w + x - x%w)/w
etc

And quoted formula is equivalent to
p += (1-p)/Speed
That was even proved with a real coder in that thread.

26. Concerning the counter updates (i'm busy at the moment, can't write much) - is this correct:

p(k+1) = p(k) + floor[ (y(k) - p(k)) / S ]

You mathematically "emulate" floor by subtracting the division's remainder:

p(k+1) = p(k) + [ (y(k)-p(k)) - (y(k)-p(k))%S ] / S

This gives:

p(k+1) = p(k) * (1-1/S) + y(k) * (1/S) - [ (y(k)-p(k))%S ] / S

Where epsilon(k) = - [ (y(k)-p(k))%S ] / S is the "noise" term. You eliminate this by just using:

p(k+1) = p(k) * (1-1/S) + y(k) * (1/S)

Is this correct? The second thing is that you include the partial sum of the above-mentioned n(k) in your counters? Anything else?

27. > Is this correct?

Yes, mostly. Actually I use a constant parameter instead of that noise.
As explained in the first post of this thread.

> above mentioned n(k) in your counters?

Just an index for the lookup table with n(k)=c*n(k-1)+1 values

> Anything else?

Delayed update?

28. Actually every integer division (or right shift) can be written like that, so you can't overcome this noise. I wonder how you want to avoid that?

29. > Actually every integer division (or right shift) can be written like that, so
> you can't overcome this noise. I wonder how you want to avoid that?

By not using any small divisors?

Anyway, it's a fact that p-=(p>>5) kind of updates can be rewritten
to use multiplication by inverse and division remainder without any
changes in probability sequence.
And then, it's a fact that replacing that remainder by a constant improves
the compression.

30. Taking this:

p(k+1) = p(k) * (1-1/S) + y(k) * (1/S) = p(k) * a + y(k) * b

In practice, a and b are scaled by a factor N = 2^M, so you need to rescale after an adjustment:

a = A/N, b = B/N
(where A and B are in [0,2^M -1])

p(k+1) = floor( (p(k) * A + y(k) * B)/N )

This can be rewritten like in the previous posts. The rescaling by N introduces this noise. I still don't understand how you want to avoid this; you still need to do a division/shift (when working with ints).

I don't get it...