> how to update the probability that the next value will be a 0 or a 1 ?
Suppose you've encountered n0 zeroes and n1 ones,
then the code length (entropy) of the processed string is
l[p0_,n0_,n1_]:=-n0*Log[2,p0]-n1*Log[2,1-p0]
and by solving l'(p)=0 we'd be able to find the p value
which minimizes the entropy.
http://www.wolframalpha.com/input/?i...%3D%3D0%2Cp%5D
which is p=n0/(n0+n1).
But it's a posterior estimation.
(i.e. it's the best approximation of the static generator's parameter, given the known bits,
but it's not necessarily optimal for prediction of the next bit.)
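As a numerical sanity check (a Python sketch, since the post itself uses "Mathematica"; the function name here is mine): the code length is indeed minimized at p=n0/(n0+n1).

```python
import math

def codelen(p, n0, n1):
    # bits needed to code n0 zeroes and n1 ones with zero-probability p
    return -n0 * math.log2(p) - n1 * math.log2(1 - p)

n0, n1 = 7, 3
p_ml = n0 / (n0 + n1)  # 0.7
# codelen at p_ml is below codelen at any nearby p value
assert all(codelen(p_ml, n0, n1) < codelen(p_ml + d, n0, n1)
           for d in (-0.1, -0.01, 0.01, 0.1))
```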
And for a prior estimation there's a useful method based on likelihoods:
Code:
(* the probability of a string with n0 zeroes and n1 ones: *)
c[n0_,n1_]:=1/Binomial[n0+n1,n0];
(* the probability of the next string being the one with an extra zero: *)
p=c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])
http://www.wolframalpha.com/input/?i...%2Bn0%5D%29%5D
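The same computation in Python (a translation of the Mathematica above; `math.comb` plays the role of Binomial). The ratio in fact simplifies to the Laplace-style (n0+1)/(n0+n1+2):

```python
from math import comb

def c(n0, n1):
    # probability of one particular string with n0 zeroes and n1 ones
    return 1 / comb(n0 + n1, n0)

def p_next_zero(n0, n1):
    # probability that the next string is the one with an extra zero
    return c(n0 + 1, n1) / (c(n0 + 1, n1) + c(n0, n1 + 1))

# the ratio simplifies to (n0+1)/(n0+n1+2):
for n0 in range(5):
    for n1 in range(5):
        assert abs(p_next_zero(n0, n1) - (n0 + 1) / (n0 + n1 + 2)) < 1e-12
```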
The same method can actually also be used with string probabilities based on the entropy
with the posterior p=n0/(n0+n1) substituted, i.e.
c[n0_,n1_]:=1/2^l[n0/(n0+n1),n0,n1] =(n0/(n0+n1))^n0*(n1/(n0+n1))^n1
but the entropy formula is an approximation (it assumes p=n0/(n0+n1), while the actual
information is that there are n0 zeroes and n1 ones; these are not the same at all),
so we'd end up with a pretty weird estimation:
p=1/(1+(n0^n0*(n1+1)^(n1+1)/(n0+1)^(n0+1)/n1^n1))
Although Shkarin was actually advertising it (as the "MP-estimator") -
http://encode.su/threads/997-Counter...ll=1#post20146
and it's supposedly better than the plain (n0+d)/(n0+n1+2*d) in practical use.
(see http://nishi.dreamhosters.com/u/mp2a.rar)
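For a feel of where this estimator lands, here it is in Python (note that Python's 0**0 == 1, which matches the convention needed for the n0==0 or n1==0 cases):

```python
def p_mp(n0, n1):
    # the "weird" estimation from above, verbatim
    return 1 / (1 + n0**n0 * (n1 + 1)**(n1 + 1)
                  / (n0 + 1)**(n0 + 1) / n1**n1)

# it lands close to the (n0+d)/(n0+n1+2*d) family, e.g. for n0=3, n1=1:
print(p_mp(3, 1))            # ~0.703
print((3 + 1) / (4 + 2))     # d=1:   ~0.667
print((3 + 0.5) / (4 + 1))   # d=0.5:  0.7
```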
Actually this approach can be improved even further by taking into
account the probability distribution of p values.
For example,
Code:
c[n0_,n1_]:=Integrate[ PDF[BetaDistribution[1,1],p]*(1-p)^n1*p^n0, {p,0,1},
Assumptions->{Element[p,Reals],Element[n0,Integers],Element[n1,Integers],n0>=0,n1>=0}]
p=FullSimplify[c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])]
gives p=(n0+1)/(n0+n1+2)
and
Code:
c[n0_,n1_]:=Integrate[ PDF[BetaDistribution[1/2,1/2],p]*(1-p)^n1*p^n0, {p,0,1},
Assumptions->{Element[p,Reals],Element[n0,Integers],Element[n1,Integers],n0>=0,n1>=0}]
p=FullSimplify[c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])]
gives p=(n0+.5)/(n0+n1+1) (the Krichevsky-Trofimov estimator)
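Both closed forms are easy to check numerically without Mathematica: the Beta-prior integral evaluates to Beta(n0+a,n1+b)/Beta(a,b) (Euler's Beta function), and the constant 1/Beta(a,b) cancels in the ratio. A Python sketch (helper names are mine):

```python
from math import lgamma, exp

def log_beta(x, y):
    # log of Euler's Beta function B(x,y) = Gamma(x)Gamma(y)/Gamma(x+y)
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def p_next_zero(n0, n1, a, b):
    # c[n0,n1] is proportional to B(n0+a, n1+b); 1/B(a,b) cancels
    c0 = exp(log_beta(n0 + 1 + a, n1 + b))  # string with an extra zero
    c1 = exp(log_beta(n0 + a, n1 + 1 + b))  # string with an extra one
    return c0 / (c0 + c1)

for n0 in range(4):
    for n1 in range(4):
        # BetaDistribution[1,1] prior -> (n0+1)/(n0+n1+2)
        assert abs(p_next_zero(n0, n1, 1, 1) - (n0 + 1) / (n0 + n1 + 2)) < 1e-9
        # BetaDistribution[1/2,1/2] prior -> (n0+.5)/(n0+n1+1)
        assert abs(p_next_zero(n0, n1, .5, .5) - (n0 + .5) / (n0 + n1 + 1)) < 1e-9
```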
But it's commonly more interesting to use secondary statistics and numerical integration there.
It's actually completely practical if we apply these formulas to the initialization of a
state-to-probability map or some such.
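Such an initialization could look like this (a hypothetical Python sketch only; the map layout and the counter cap are assumptions, not taken from any actual CM implementation):

```python
MAX_COUNT = 16  # assumed cap on each counter

# precompute probability for every counter state (n0, n1)
# from the Krichevsky-Trofimov formula:
kt_map = {
    (n0, n1): (n0 + 0.5) / (n0 + n1 + 1)
    for n0 in range(MAX_COUNT)
    for n1 in range(MAX_COUNT)
}

# a model would then look up its current state instead of recomputing:
assert kt_map[(0, 0)] == 0.5
assert kt_map[(3, 1)] == 0.7
```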
Btw, the plot of BetaDistribution[1,1] is this: http://www.wolframalpha.com/input/?i...2C0%2C1%7D+%5D
(i.e. all p values have the same probability)
and BetaDistribution[1/2,1/2] is this: http://www.wolframalpha.com/input/?i...2C0%2C1%7D+%5D
(i.e. probabilities near 0 and 1 are more likely to appear)
Of course, when we know that the data is not random, the latter is a more reasonable assumption.
But we'd only know the actual distribution after collecting the statistics from various contexts in a real CM.
--------------
1. This was originally a comment on Cyan's blog, but then I tried to fix some typos...
2. "Mathematica" syntax is used.