Seems like a good problem; it inspired some interesting experiments.

http://ctxmodel.net/files/guess/bit_guess_v2.rar

Code:

v0 22-10-2009:
1 1 0.99525 728.1; 3221 of 5950 guessed (54.13%)
+ Initial implementation
+ mix(mix(mix(p1,p2),mix(p3,p4)),0.5) based on mix_test
+ Parameter optimization
v1 01-11-2009:
1 1 0.75645 728.5; 3298 of 5950 guessed (55.43%)
0 0 0.35058 898.5; 4009 of 7315 guessed (54.81%)
+ More input data included, model retuned
+ Dumb secondary model added (guessing the previous bit when the model's estimate is near 0.5)
v2 07-11-2009:
0 0 0.77592 724.7; 3286 of 5950 guessed (55.23%)
1 1 0.35727 891.6; 4023 of 7315 guessed (55.00%)
or
0 0 0.93299 729.2; 3376 of 5950 guessed (56.74%)
1 1 0.71908 897.1; 4135 of 7315 guessed (56.53%)
+ Completely new model using FP math
+ new "beta counters" for contextual probability estimation
+ new logistic mixer implementation
+ precise floating-point math, no tunable tables
+ BFA update
+ weight extrapolation
+ mix(v1P,mix(mix(mix(mix(mix(p1,0.5),mix(p2,0.5)),mix(p3,0.5)),p4),0.5))
(v1 model mixed with the new one)
+ optimization by guess-rate instead of entropy
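
For anyone who wants to play with the idea without unpacking the archive, here is a rough sketch of a two-input logistic mix(), the "near 0.5" fallback from v1, and the two scores mentioned above (entropy vs. guess rate). This is not the code from bit_guess_v2; the names (Mix2, guess, evaluate), the single-weight mixer form, the learning rate and the margin are all my assumptions, picked only for illustration.

```python
import math

EPS = 1e-6  # assumed clamp so stretch() never sees 0 or 1

def stretch(p):
    """Logit of a probability, clamped away from 0 and 1."""
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))

def squash(x):
    """Inverse of stretch (logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))

class Mix2:
    """Hypothetical binary logistic mixer with one adaptive weight."""
    def __init__(self, w=0.5, rate=0.02):
        self.w = w        # 0 -> trust p1 only, 1 -> trust p2 only
        self.rate = rate  # assumed learning rate
        self.s1 = self.s2 = 0.0
        self.p = 0.5

    def mix(self, p1, p2):
        # Interpolate in the logistic domain; p1, p2 are P(bit == 1).
        self.s1, self.s2 = stretch(p1), stretch(p2)
        self.p = squash(self.s1 + (self.s2 - self.s1) * self.w)
        return self.p

    def update(self, bit):
        # Gradient step that reduces the coding cost of the seen bit.
        self.w += self.rate * (bit - self.p) * (self.s2 - self.s1)

def guess(p, prev_bit, margin=0.01):
    """Fall back to the previous bit when the estimate is near 0.5."""
    if abs(p - 0.5) < margin:
        return prev_bit
    return 1 if p > 0.5 else 0

def evaluate(bits, inputs, mixer):
    """Return (coding cost in bits, number of correct guesses)."""
    cost, guessed, prev = 0.0, 0, 0
    for bit, (p1, p2) in zip(bits, inputs):
        p = mixer.mix(p1, p2)
        guessed += (guess(p, prev) == bit)
        cost -= math.log2(p if bit else 1.0 - p)
        mixer.update(bit)
        prev = bit
    return cost, guessed
```

Nested calls like mix(mix(p1,p2),mix(p3,p4)) would just use one such mixer object per node. The two numbers returned by evaluate() can disagree about which parameter set is better, which is why tuning for guess rate and tuning for entropy can give different models.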