Hi,

It's a little nerdy to post a technical thing in the first hour of the year, but anyway...

I'm working on yet another version of TarsaLZP. Right now I'm working on the low order PPM part - the part that was very weak in all released versions so far.

I've created a bytewise coder which scores very well on enwik8 - for example 61.536.584 bytes with order-0 modelling (S = 512), with only a single set of probabilities (not frequencies). What's different in my model compared to usual bytewise coders is that I'm updating the model the same way fpaq0p does, i.e. I'm adding to the selected symbol's frequency a fraction of the sum of the other symbols' frequencies. Usual coders just increase the counter of the selected symbol by a fixed amount.

Adjusting in usual bytewise coders is done this way:
counter[selected] += C;

In my coder I'm doing (probabilities aren't normalized, I'm only maintaining correct relative weights):
increase = (probabilitySum - probability[selected]) / S;
probability[selected] += increase;
probabilitySum += increase;

This way a recent symbol always gets a greater weight than any older one, and neither the strength of rescalings nor the time since the last rescaling plays any role (except for a slight impact on rounding errors). Just like bitwise fpaq0p-style modelling.
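The update above can be written down as a tiny C sketch (only S = 512 and the variable names probability/probabilitySum come from the post; the flat initialization and array size are my assumptions):

```c
#include <stdint.h>

#define ALPHABET 256
enum { S = 512 }; /* learning rate from the post */

uint64_t probability[ALPHABET];
uint64_t probabilitySum;

/* Hypothetical flat initialization - every symbol starts with equal weight. */
void model_init(void) {
    for (int i = 0; i < ALPHABET; i++) probability[i] = S;
    probabilitySum = (uint64_t)ALPHABET * S;
}

/* Add to the selected symbol a 1/S fraction of the total weight of all
   the other symbols; no normalization, only relative weights matter. */
void model_update(int selected) {
    uint64_t increase = (probabilitySum - probability[selected]) / S;
    probability[selected] += increase;
    probabilitySum += increase;
}
```

Note that probabilitySum grows over time, unlike in fpaq0p where the total stays constant - a point raised later in the thread.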

What I want to adaptively tune is the learning rate S. I have ruled out model mixing because:
- model mixing with non-normalized probabilities would be expensive,
- there will be an LZP layer, and that means symbol exclusion, which is incompatible with probability normalization.

I want to tune the order-1 model learning rate. My current idea is to have three order-1 models:
- one with a fixed fast adaptation rate,
- one with a fixed slow adaptation rate,
- one with an adaptively tuned rate.
Actual coding would be done with probabilities from that third model, but I would want to use information from the first two models to update the learning rate of the third model. The question is: how to do that efficiently?

2. What you want is probably an indirect model: context -> bit history -> prediction. So if you have a sequence like 000001, you don't have to guess 1 (fast adapting) or 0 (slow adapting). You look at previous occurrences of the same sequence to see what happened (using a slow adapting model initialized in this case to 1/6).

ZPAQ uses an 8-bit bit history, which is not ideal because a slow adapting direct model still works better on uniform sources. Here are some tests using direct (CM 16..4 = slow..fast) and indirect (ICM) order-0 models. maxcomp.cat is the concatenated maximum compression corpus.

Code:
```comp 0 0 0 0 1  (enwik8    calgary.tar  maxcomp.cat)
(0 cm 9 16    (61241069    1681577    35339958)) (uncomment one line)
(0 cm 9 12    (61217687    1681256    35259845))
(0 cm 9 8     (61290591    1684348    35195798))
(0 cm 9 4     (61902641    1705057    35321488))
(0 icm 5      (62736218    1717028    34535069))
hcomp
halt
post
0
end```
Using a longer bit history might get better results, but then you have to think harder about the second modeling stage. Besides, ICM is designed more for higher order contexts.

3. Do you suggest a bitwise model? I already ruled that out because I will be doing symbol exclusion (usually of some very probable symbol) and bitwise coding is not compatible with that. Or do you suggest histories (where each history bit tells which model was better) and switching between models? Well, I think switching should be worse than an adaptive learning rate.

> Besides, ICM is designed more for higher order contexts.
I will use LZP for higher orders, like I did in previous versions of TarsaLZP - there were order-4 and order-8 LZP models, adaptively switched.

The goal is to beat Francesco's RINGS on enwik using a single core, with similar speed of course.

4. That looks interesting.
What kind of speed do you get with this method?
It apparently involves updating 256 fields per byte,
so it may not end up being faster than bit coding...

5. No, I'm updating only one probability, as I'm not normalizing probabilities. If I wanted to do a CM coder I would probably opt for nibblewise coding, no symbol exclusion, and normalizing to 2^N to avoid divisions entirely. Such a nibblewise CM should be both far superior to a bytewise coder in terms of speed and somewhat superior to a bitwise CM in terms of compression ratio on texts.

With normalization to 2^N I would be forced to recompute all values after each byte. So the routine would be:
sumOthers = 0;
for (symbol : symbols)
    if (symbol != selected)
        sumOthers += (probability[symbol] -= probability[symbol] >> S);
probability[selected] = (1 << N) - sumOthers;
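A runnable version of that routine might look like this (a sketch; the concrete values N = 16 and a 1/32 rate are my choices, not values from the post):

```c
#include <stdint.h>

#define ALPHABET 256
enum { N = 16, SHIFT_S = 5 }; /* hypothetical: total = 2^16, rate = 1/32 */

uint32_t prob[ALPHABET];

void renorm_init(void) {
    for (int i = 0; i < ALPHABET; i++) prob[i] = (1u << N) / ALPHABET;
}

/* The routine from the post: shrink every other symbol by a 2^-SHIFT_S
   fraction and give the freed mass to the selected symbol, so the
   total always stays exactly 2^N and no division is needed. */
void renorm_update(int selected) {
    uint32_t sumOthers = 0;
    for (int sym = 0; sym < ALPHABET; sym++)
        if (sym != selected)
            sumOthers += (prob[sym] -= prob[sym] >> SHIFT_S);
    prob[selected] = (1u << N) - sumOthers;
}
```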

What I'm doing now is grouping symbols into 16 groups of 16 symbols each. That speeds things up a lot and can be easily vectorized even with MMX, as I'm using 16-bit precision in the model.

Sample session (unnecessary info removed):
Code:
```piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/TarsaLZP/dist/Release/GNU-Linux-x86/tarsalzp c enwik8 enwik8.ari

real	0m11.496s
user	0m11.145s
sys	0m0.316s
piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/TarsaLZP/dist/Release/GNU-Linux-x86/tarsalzp d enwik8.ari enwik8.dec

real	0m16.862s
user	0m16.077s
sys	0m0.460s
piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/TarsaLZP/dist/Release/GNU-Linux-x86/tarsalzp c enwik9 enwik9.ari

real	2m1.621s
user	1m51.719s
sys	0m3.404s
piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/TarsaLZP/dist/Release/GNU-Linux-x86/tarsalzp d enwik9.ari enwik9.dec

real	2m48.654s
user	2m40.782s
sys	0m3.724s
piotrek@msi-wind:~/Pulpit/enwik$ time fpaq/fpaq0pv4A/a.out c enwik8 enwik8.fpaq
enwik8 (100000000 bytes) -> enwik8.fpaq (61457810 bytes) in 9.30 s.

real	0m9.735s
user	0m9.005s
sys	0m0.344s
piotrek@msi-wind:~/Pulpit/enwik$ time fpaq/fpaq0pv4A/a.out d enwik8.fpaq enwik8.dec
enwik8.fpaq (61457810 bytes) -> enwik8.dec (100000000 bytes) in 11.58 s.

real	0m11.975s
user	0m11.229s
sys	0m0.424s
piotrek@msi-wind:~/Pulpit/enwik$ time fpaq/fpaq0pv4A/a.out c enwik9 enwik9.fpaq
enwik9 (1000000000 bytes) -> enwik9.fpaq (622237009 bytes) in 95.88 s.

real	1m46.947s
user	1m32.854s
sys	0m3.044s
piotrek@msi-wind:~/Pulpit/enwik$ time fpaq/fpaq0pv4A/a.out d enwik9.fpaq enwik9.dec
enwik9.fpaq (622237009 bytes) -> enwik9.dec (1000000000 bytes) in 117.86 s.

real	2m11.885s
user	1m54.251s
sys	0m3.760s
piotrek@msi-wind:~/Pulpit/enwik$ ls -al
-r--r--r-- 1 piotrek piotrek  100000000 2011-06-01 16:24 enwik8
-rw-rw-r-- 1 piotrek piotrek   61536584 2012-01-01 13:17 enwik8.ari
-rw-rw-r-- 1 piotrek piotrek  100000000 2012-01-01 13:28 enwik8.dec
-rw-rw-r-- 1 piotrek piotrek   61457810 2012-01-01 13:27 enwik8.fpaq
-r--r--r-- 1 piotrek piotrek 1000000000 2011-06-01 17:29 enwik9
-rw-rw-r-- 1 piotrek piotrek  623688984 2012-01-01 13:20 enwik9.ari
-rw-rw-r-- 1 piotrek piotrek 1000000000 2012-01-01 13:32 enwik9.dec
-rw-rw-r-- 1 piotrek piotrek  622237009 2012-01-01 13:30 enwik9.fpaq
piotrek@msi-wind:~/Pulpit/enwik$```
Eugene's fpaq0something compiled with:
Code:
`piotrek@msi-wind:~/Pulpit/enwik/fpaq/fpaq0pv4A$ g++ -m32 -O3 -fomit-frame-pointer fpaq0pv4.cpp`
So fpaq0something is still faster than my bytewise coder, but my coder offers the possibility of easy and precise symbol exclusion, which is very important in TarsaLZP.

I've attached a snapshot of my coder (Linux 64-bit builds + NetBeans project with automatically generated makefiles that probably work standalone) and I'm still waiting for ideas on the questions I've asked in this topic.

6. Another version for comparison - http://nishi.dreamhosters.com/u/rc_v3a.rar
It's about 2x slower than fpaq0pv4b, but that's mostly a matter of obvious i/o speed optimizations.
But the enwik8 result here is 61172542.

7. > increase = (probabilitySum - probability[selected]) / S;

It's different in this case, though, because in fpaq0p the
equivalent of your probabilitySum remains constant... so
the actual probability should be different.

Still, I think it should be possible to recalculate the increment
in such a way that the _probability_ (not just the freq, like in your case)
would be updated in the same way as in fpaq0p.
It would likely be much slower, but imho it's interesting how it
would affect the compression.

> model mixing with non-normalized probabilities would be expensive,

Not really.
"Linear" mixing like (freq1*(1-w)+freq2*w)/(total1*(1-w)+total2*w)
is quite affordable imho (especially when *w can be replaced with shifts etc).
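That linear mix can be sketched with a fixed-point weight (the 12-bit weight resolution, the 16-bit output scale and the function name are my assumptions, not something from the post):

```c
#include <stdint.h>

enum { W_BITS = 12, W_ONE = 1 << W_BITS }; /* hypothetical weight scale */

/* "Linear" mixing of two non-normalized frequency models, per the formula
   above: p = (freq1*(1-w)+freq2*w) / (total1*(1-w)+total2*w).
   Returns the mixed probability as 16-bit fixed point (0..65536). */
uint32_t mix2(uint32_t freq1, uint32_t total1,
              uint32_t freq2, uint32_t total2, uint32_t w) {
    uint64_t num = (uint64_t)freq1 * (W_ONE - w) + (uint64_t)freq2 * w;
    uint64_t den = (uint64_t)total1 * (W_ONE - w) + (uint64_t)total2 * w;
    return (uint32_t)((num << 16) / den);
}
```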

And if you mean having to mix all 256 freqs at once,
its also not necessarily the case, as you can mix some kind
of cumulative freqs instead (like using unary or binary lookup).

But that's exactly how you end up with binary probabilities instead of freqs,
after some optimizations. Btw, did you see the counter in
http://ctxmodel.net/files/PPMd/ppmd_Jr1_sh8.rar
defined in mod_see.inc, used in ppmd_proc2.inc?
It's a tricky variable-rate shift-based counter.

> The question is: how to do that efficiently?

I'd say, by comparing the codelengths generated by other submodels.
In short, compute an exponentially-weighted entropy estimation for
a model, like clen = clen*(1-wr) + LOG2[p]
(or even (clen*=(1-wr))+=LOG2[total]-LOG2[freq] )
But again, after trying to make it more efficient for a while,
you're very likely to get a standard mixer from that.

An interesting alternative though is explicit switching + optimal parsing...
i.e. it's possible to make an asymmetric coder which would encode extra flags
to switch between models, and the example of lzma shows that with precise
estimation of overall entropy it's possible to minimize the redundancy
from such a non-bijective code and reach similar compression as with simple
mixing, while always using just a single submodel in the decoder (the encoder would be
slower though).

8. > increase = (probabilitySum - probability[selected]) / S;

> It's different in this case, though, because in fpaq0p the
> equivalent of your probabilitySum remains constant... so
> the actual probability should be different.
>
> Still, I think it should be possible to recalculate the increment
> in such a way that the _probability_ (not just the freq, like in your case)
> would be updated in the same way as in fpaq0p.
> It would likely be much slower, but imho it's interesting how it
> would affect the compression.
I think it will end up being basically the same. The only difference is that in my scheme without renormalization only the selected symbol's probability grows by a fixed fraction, while with renormalization the probabilities of non-selected symbols also shrink by a fixed fraction. So the only difference is that the adaptation speed is usually slightly faster in the normalizing scheme for the same constant factor.

> model mixing with non-normalized probabilites would be expensive,

> Not really.
> "Linear" mixing like (freq1*(1-w)+freq2*w)/(total1*(1-w)+total2*w)
> is quite affordable imho (especially when *w can be replaced with shifts etc).
>
> And if you mean having to mix all 256 freqs at once,
> it's also not necessarily the case, as you can mix some kind
> of cumulative freqs instead (like using unary or binary lookup).
>
> But that's exactly how you end up with binary probabilities instead of freqs,
> after some optimizations.
Yes, I mean mixing all symbols at once, and that's why I mentioned nibblewise model mixing. Anyway, mixing is rather incompatible with symbol exclusion; at the least, symbol exclusion makes things messy, slow and imprecise when mixing.

Besides, in my nibblewise model mixing idea there would be no symbol exclusion, so a renormalizing scheme could be used; then total1 could be set equal to total2 and both would be powers of 2. The division would then be replaced by a simple bit shift, and that would make mixing all symbols affordable - with an average of 8 mixings per nibble there would be on average 16 mixings per byte, and that's not too far off the 8 mixings in bitwise coders. With periodical symbol sorting the average number of mixings could even be reduced to fewer than 8 in the nibblewise case.
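With total1 = total2 = 2^N as described, Shelwien's mixing formula indeed collapses to a shift; a minimal sketch (the function name and the 12-bit weight scale are hypothetical):

```c
#include <stdint.h>

enum { W_BITS = 12, W_ONE = 1 << W_BITS }; /* hypothetical weight scale */

/* When both models keep their sums normalized to the same 2^N, the
   denominator total1*(1-w)+total2*w is the constant 2^N * W_ONE, so the
   division reduces to dropping the W_BITS weight scale with a shift.
   The result is a mixed freq on the same 2^N scale. */
uint32_t mix2_pow2(uint32_t freq1, uint32_t freq2, uint32_t w) {
    return (uint32_t)(((uint64_t)freq1 * (W_ONE - w)
                     + (uint64_t)freq2 * w) >> W_BITS);
}
```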

> I'd say, by comparing the codelengths generated by other submodels.
> In short, compute an exponentially-weighted entropy estimation for
> a model, like clen = clen*(1-wr) + LOG2[p]
> (or even (clen*=(1-wr))+=LOG2[total]-LOG2[freq] )
> But again, after trying to make it more efficient for a while,
> you're very likely to get a standard mixer from that.
Can you elaborate on that? What do the variables mean? Assume, like in my original idea, that we have three o1 models: one fast adapting, one slow adapting, and one which we actually use and want to compute a good learning rate for.

> An interesting alternative though is explicit switching + optimal parsing...
> i.e. it's possible to make an asymmetric coder which would encode extra flags
> to switch between models, and the example of lzma shows that with precise
> estimation of overall entropy it's possible to minimize the redundancy
> from such a non-bijective code and reach similar compression as with simple
> mixing, while always using just a single submodel in the decoder (the encoder would be
> slower though).
I'm almost sure that won't be worth the hassle. I want something rather symmetric, without excessively slowing down compression.

9. Originally Posted by Shelwien:
> clen = clen*(1-wr) + LOG2[p]
I'd use alen=alen*(1-wr)+w*LOG2[p]

10. Originally Posted by Piotr Tarsa:
> Can you elaborate on that? What do the variables mean?
This is one of many possible ways; it is used in CPVLMS. Use one fast model and one slow one. Count the entropy of the two over an interval. If the fast one was better, make the real rate a little faster. If the slow one was better, make the real rate a little slower.

11. > symbol exclusion

Btw, there _is_ a way to implement a bitwise PPM at least.
The escape can be encoded per bit (i.e. basically a ternary code),
so that the model would go to the specific branch of a lower-order context,
while all other symbols would be excluded automatically.

> Assume, like in my original idea, that we have three o1 models

As I said, the best way is to estimate the entropy generated
by submodels (without actual coding), and tweak the 3rd model
depending on results of 1st and 2nd.
(So in a way, the 3rd model becomes a mix of 1st and 2nd,
and trying to tweak such a nonlinear parameter as update rate
would likely make it worse than usual mixers).
But in such cases we can't use the total entropy, because that would
make it too stationary. Thus new values have to be added with higher weight,
same as with counters.
And then, exponential weighting is not necessarily the best here
from compression p.o.v, but surely is the most practical.

Btw, if the "1-wr" is what you don't understand there, the idea
is that we have a exponentially weighted sum of symbol coding costs:
clen = Sum[ w^i*l[pos-i], i=0..pos ]
which is incrementally computed via clen=clen*w+l.
And w is written as (1-wr) because "wr" is my usual name for update rates,
with common values near 0, thus (1-wr) is a little below 1.

> I'd use alen=alen*(1-wr)+w*LOG2[p]

That's actually the same in this case - there's no point in normalizing
the values when the only thing that matters is their relation.

12. Okay. Thanks for the explanations.

I understand that we have to compute after each symbol:
ef = ef*w + log(p_ef[s])
es = es*w + log(p_es[s])

Where e[f|s] is the weighted sum of costs in the [fast|slow] model, log(p_e[f|s]) is the cost of the current symbol in the [fast|slow] model, and w is the decay rate.

Now, how do I compute rate_m (m stands for 'main' there)? The problem for me is that with rate_m somewhere in between rate_f and rate_s we can get better compression than with both the fast and the slow model. How do I update rate_m then?

Also, do you know some neat method for fast and somewhat accurate (>10 bits of accuracy, for example) logarithm estimation using integer arithmetic?

13. As stated, for example:

Code:
```ef = ef + log(p_ef[s])
es = es + log(p_es[s])
nsample++
if (nsample > INTERVAL_LEN)
{
  if (ef < es) {
    rate_m = min((rate_m + rate_max)/2, rate_max)   (or min(rate_m*alpha, rate_max) with alpha > 1)
  } else {
    rate_m = max(beta*rate_m, rate_min)             (with beta < 1)
  }
  ef = es = 0
  nsample = 0
}```
so it gets faster more readily than slower; with the first variant, adaptation could be better.

With a test-interval version like this you should leave the "w" parameter out.

14. > I understand that we have to compute after each symbol:
> ef = ef*w + log(p_ef[s])
> es = es*w + log(p_es[s])

Yea, but it may be enough to compute
dfs = dfs*w + log(p_ef[s]) - log(p_es[s])
(btw, I imply that these are actually -log there, or log(1/p))

Also the common mistake in that kind of estimation is forgetting
to switch p depending on bit value.
I.e. it should be
p = model.probability(context); // prob of bit==1
(e*=w) += log2( bit ? p : 1-p );

> How to update rate_m then?

I'd suggest something like
rate_m = ef<es ? max(rate_f,rate_m-1) : min(rate_s,rate_m+1);
or maybe
squash(x) = 1/(1+exp(-x))
rate_m = rate_f + (rate_s-rate_f)/2 * squash( (ef-es)*coef )

> logarithm estimation using integer arithmetics?

Code:
```void LzmaEnc_InitPriceTables( uint* ProbPrices ) {
  uint i, j;
  for( i=(1<<kNumMoveReducingBits)/2; i<kBitModelTotal; i+=(1<<kNumMoveReducingBits) ) {
    const int kCyclesBits = kNumBitPriceShiftBits;
    uint w = i;
    uint bitCount = 0;
    for( j=0; j<kCyclesBits; j++ ) {
      w = w * w;
      bitCount <<= 1;
      while( w>=(1<<16) ) { w>>=1; bitCount++; }
    }
    ProbPrices[i>>kNumMoveReducingBits] = ((kNumBitModelTotalBits<<kCyclesBits) - 15 - bitCount);
  }
}```
"kCyclesBits" sets the precision there.
With some adjustments I've got a full match with math.h results.  Reply With Quote

15. I've opted for model switching (based on weighted costs) instead of adjusting the learning rate. Computing costs takes a surprisingly long time, even though I'm using a small 2-KiB LUT for computing logarithms. Currently my algorithm consists of two (slow and fast) o1 models and two (slow and fast) o2 models and selects only one when coding a symbol. Maybe I'll remove one o1 model.

My algo is awfully slow now (about 4x slower than fpaq0pv4B) but compresses rather well for an o2 coder. Currently it scores:
- 37 067 568 bytes for enwik8,
- 348 348 128 bytes for enwik9.

I have some constants in my algorithm that I would want to tune:

For example, the first one is used in updating the selected symbol in the o1 fast adapting model, i.e. prob[s] += (total - prob[s]) >> O1_FAST_ADAPTION_RATE. The cost adaptation rate is the base-2 logarithm of the fraction of the weighted sum that I'm subtracting after every symbol.

1. What do you use for parameter optimization?
2. Can you point me to some good order-2 (preferably also mixed with order-1) coders?

16. Originally Posted by Piotr Tarsa:
> 1. What do you use for parameter optimization?
I've used the MATLAB GA Toolbox for my optimization purposes. It tunes a single bit stream which consists of all tunable parameters. As a post-processing stage, I also used a random bit-flipping optimizer to find the local optimum where GA left off. It worked very well. You can confirm that yourself by looking at BIT's results.

17. > Computing costs takes surprisingly long time,

That might be due to a long dependency chain.
Explicit prefetching might help (especially with o2) - e.g. prefetching at the end of the previous iteration
instead of at the start of the iteration.

> My algo is awfully slow now (about 4x slower than fpaq0pv4B) but
> compresses rather good for a o2 coder.

o2 is completely different from o0-o1, because it doesn't fit into caches.
So 4x slower may be actually quite good.

> - 37 067 568 bytes for enwik8,

The result with ppmd -o2 is 36,800,798

> 1. What do you use for parameter optimization?

A simple perl script which does binary search on each parameter in a loop,
and runs the actual coder to test a parameter set.
Somehow I never actually needed anything stronger, though.

However, I use it in conjunction with a parameter description DSL,
and apparently my optimization framework is much easier to use than
all others (eg. toffer's from m1).

> 2. Can you point me to some good order-2 (preferably mixed with order-1 also) coders?

How about ppmd -o2? Also http://compression.ru/ds/ppmsj.rar

18. As you only use probabilities, you can mimic PPMd's inherited frequencies by just letting a higher order's prob be the lower order's prob when the context has not been used yet; this helped my o2 coder a lot.

19. > I've used MATLAB GA Toolbox for my optimization purposes.
MATLAB is unfortunately paid software, and I'm not a student anymore so I can't play with MATLAB for free at a university. Do you know some freeware (preferably Linux friendly) with decent optimization features?

And what to optimize for? A weighted compression ratio (on some predefined corpus of subjectively interesting data) doesn't have to be the optimal function, I think, as some files like FlashMX.pdf are much harder to compress. Looking at TarsaLZP results on maximumcompression.com: if I improve the compressed FlashMX.pdf size by 1% then I will jump from 84th place to 42nd place (exactly halfway), but if I improve the compressed english.dic size by 1% then my position will stay the same.

I could use a function that compares the output size of my algorithm to, e.g., the size of paq8's output, but then I see little sense in comparing to coders that have special filters, sparse contexts, etc. on data that benefits from those special models, for reasons similar to those stated in the paragraph about FlashMX.pdf - some files aren't very compressible without filters or sparse contexts, so striving to compress them better without those features would bring too little gain on such files to offset losses on other files.

Also, in the end, TarsaLZP will have 3 layers - high order LZP, medium order SR and low order PPM. As higher order layers are basically independent from lower layers, I could compute the cost of the output from higher layers and optimize it separately. What function (of compressed size) should I optimize then?

> - 37 067 568 bytes for enwik8,

> The result with ppmd -o2 is 36,800,798
Hmmm, ppmd AFAIR uses something like SSE based on frequency rank and lots of small contexts/flags, so there's probably a high level of indirection in probability mapping, but anyway ppmd is a good point to start from. I'll probably add an order-3 symbol ranking layer in between the LZP and PPM layers in the upcoming TarsaLZP, so before adding the LZP layer I think I will have a better comparison to ppmd.

On enwik9 the differences are smaller though, as ppmd -o2 scores 347 435 600 bytes, so my compressed size is only 0.26% larger.

> As you only use probabilities, you can mimic PPMDs inherited frequencies by just letting higher orders prob be lower orders prob when context was not used yet, this helped my o2-coder a lot.
Good idea. I think I'll add some additional counters to o2 contexts and temporarily disable them if they were seen, e.g., fewer than a couple of times.

20. Here are some ZPAQ results on enwik8 using some strictly order 2 models (no fallback or mixing with lower orders). Compression times are on a 2.0 GHz T3200, Win32. Compression times for all of the CM models are about the same (37-38 sec) so I didn't show all of them. Times are for 1 thread. For comparison:

ppmd -o2: 36800798, 8.4 sec, 1.3 MB memory
ppmonstr -o2: 33178379, 150 sec, 1.3 MB

Code:
```(test.cfg)
comp 0 1 0 0 3 (change n=3 to 1 for other models)
(0 icm 17   (36065829, 41s))
(0 cm 23 16 (37489742))
(0 cm 23 8  (37133663))
(0 cm 23 6  (37062353, 37s))
(0 cm 23 4  (37099224))
(0 cm 23 3  (37271166))
(0 cm 23 2  (37803631))
0 cm 23 4
1 cm 23 8
2 mix 0 0 2 32 255 (36634347, 70s)
hcomp
*b=a hash b++ hash *d=a halt
post 0 end```
The commented out models are single component models (n=1). "icm 17" is an indirect context model (context -> bit history -> prediction) using 2^(17+6) = 8M histories of 1 byte each. "cm 23 n" is a direct context model (context -> prediction) with 2^23 = 8M entries (32 MB) and an update rate of max(1/count, 1/(4*n)), so higher numbers adapt slower. The optimal rate is 6 (1/24). The uncommented model mixes 2 cm's on either side of the optimal rate with an order -1 (no context) mixer and a weight learning rate of 32/4096 (higher numbers are faster), which I tuned on enwik7 for best compression. The other parameters are 0 (number of context bits), 0 (start of input range), 2 (number of inputs) and 255 (order 0 context mask, irrelevant in this case).
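The update rate max(1/count, 1/(4*n)) described above can be sketched like this (a simplified floating-point sketch of the rule, not ZPAQ's actual fixed-point table-driven code; the names are mine):

```c
/* A direct-model counter: the prediction moves toward the observed bit
   by 1/count while count < limit, then levels off at the fixed rate
   1/limit - i.e. an effective rate of max(1/count, 1/limit),
   with limit = 4*n in the ZPAQ notation above. */
typedef struct { double p; unsigned count; } Counter;

void counter_update(Counter *c, int bit, unsigned limit) {
    if (c->count < limit) c->count++;
    c->p += (bit - c->p) / (double)c->count;
}
```

The count-based phase makes early estimates unbiased (like a running average), while the capped rate keeps the model adaptive afterwards; larger limits adapt more slowly.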

The COMP parameters 0 1 select the array sizes H[2^0=1] and M[2^1=2]. I use M as a rotating history buffer of the last 2 bytes from which the context hash is computed and put in H. All of the components share this context.

The HCOMP code is called after coding each byte with that byte in the A register. *B=A stores A in M[B mod 2]. HASH computes (A=A+*B+512)*773. *D=A stores A in H[D mod 1].

To compress I used: "zpaq -mtest c x \res\enwik8" to create x.zpaq. (You may get a slightly different result depending on the path length to enwik8.)

21. > Do you know something freeware (preferably Linux friendly) with
> decent optimization features?

In fact it's frequently not a good idea to use a universal optimization engine for this.
It would be much more likely to test all kinds of weird corner cases, and would probably
stop too early. Well, that's what most people want it to do, though, but it's clear that
a compressor optimization can't converge without even testing all the parameter bits.

Anyway, I can give you my optimization scripts - they're written in perl, so would work
in linux too. But you're probably better off writing some of your own.
You've tried tweaking it manually, right? How about automating that process?

> And what to optimize for? Weighted compression ratio

In most cases I used book1, book1bwt, or book1+wcc386 as optimization targets.
Well, also the whole SFC set in lzmarec.
I guess, most of my models are slow enough so that carefully choosing
the optimization target wasn't really an option.

But actually I never thought of integrating any weighted metrics
into the optimizer - it just looks at the compressed file size.
We can always compress multiple files into a single archive,
and also adjust individual sample sizes to fix up the metric,
so imho there's no need to add extra dependencies to the optimizer.

Btw, in some cases I also included the memory size to the metric,
but that was done by adding something like this to the actual coder:
Code:
```fseek( out, (memoryused>>12)-1, SEEK_CUR );
putc( 0, out);```
> if I improve compressed FlashMX.pdf size by 1% then I will jump from
> 84'th place to 42'th place (exacly halfway) but if I improve
> compressed english.dic size by 1% then my position will stay the same.

In a way, you are right here, as a different metric (eg. min sum of places in maxcomp)
would produce different results.

But imho you can think about that after getting it to work at all,
and plain compressed size of a single file (like enwik or something)
should be good enough for that.

> Also, in the end, TarsaLZP will have 3 layers
> What function (of compressed size) should I optimize then?

Simple reduction in compressed size is good enough imho,
the absolute value doesn't matter.

> Hmmm, ppmd AFAIR uses something like SSE based on frequency rank

Only ppmonstr has SSE.
ppmd only has SEE (ie separate binary models for escape flags),
so there's nothing wrong with it.

> On enwik9 the differences are smaller though as ppmd -o2 scores 347
> 435 600 bytes, so that my compressed size is only 0.26 % larger.

ppmd has additional options like -m and -r, so that may be not
the best result on enwik9.

> Good idea. I think I'll add some additional counters to o2 contexts
> and temporarily disable them if they were seen eg less than couple
> of times.

ppmd does that at symbol level though.
Afair it tries to assign the new freq in such a way that
the symbol would initially have the same probability as it previously had.
And with freqs, that's a bit tricky.

22. In my case, initially MATLAB generates a random bit string and runs a small utility to patch my compressor executable with that bit string. In the compressor source, I've intentionally left a code portion that can be found via a signature and then patched. After patching, my compressor runs on a target (for small improvements I usually used book1+wcc386; if it yields a better result I do a final optimization with the whole SFC). Whole-SFC optimization takes a very long time, varying between 7-14 days of non-stop work on my laptop (I've permanently damaged a MacBook Pro with that). The optimization function was not any kind of approximation - it was basically the compressed size. But it's better to integrate Shelwien's metric function or similar (compressing time + transmission time + decompressing time). One of the most important things: GA and similar algorithms deal with a quite large solution space, so they may never converge. If improvements become very very small, you should run a separate algorithm to squeeze out the last few KiB. In my case, I've used a Shelwien-script-like approach (bit-flipping).

Why this setup:
1. I don't need to deal with GA internals. There is an already working framework.
2. I can even use vectorized GA with a single option, which runs my compressor in parallel, thus shortening optimization time.
3. GA is good for such optimization.

Downsides:
1. Toffer's optimizer can converge faster, due to custom operators, e.g. non-linear quantization table optimization, which is a hard problem in my case.
2. MATLAB is a very bloated application and it's very resource hungry. I can't use it on all of my computers. A simple console application would be better.
3. Patching and creating an output compressed archive each time causes a huge amount of I/O. Thus a RAM drive could be better.

23. I'll deal with GA soon (hopefully), after adding SR and LZP. But for now there are still ways to improve the PPM part.

Matt:
Thanks for the comparisons and the script - it will be a helpful tool. But I still think ICM adds more information than simple frequencies or counters. AFAIU, ICM is to CM as fpaq0f is to fpaq0p, and there are controversies about fpaq0f being truly order-0. Anyway, it's not a dealbreaker. State maps are a simple and elegant way to deal with a lot of problems at once, but they're certainly not feasible with bytewise coding and probably don't bring much benefit when higher order models are also present.

My coder, without many optimizations, is less than twice as slow as PPMd -o2 (on my AMD E-350), which AFAIR does frequency sorting that should improve speed considerably on compressible data. So that's much better than ZPAQ does, although maybe ZPAQ's JIT isn't optimized for low order coding.

Code:
```piotrek@msi-wind:~$ cd Pulpit/enwik/
piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/cTarsaLZP/dist/Release/GNU-Linux-x86/ctarsalzp c enwik8 enwik8.ari

real	0m42.466s
user	0m40.955s
sys	0m0.932s
piotrek@msi-wind:~/Pulpit/enwik$ time ppmd/PPMd e -m256 -o2 enwik8
Fast PPMII compressor for textual data, variant J, Dec 18 2011
enwik8.pmd already exists, overwrite?: <Y>es, <N>o, <A>ll, <Q>uit?y
enwik8:100000000 >36800776, 2.09 bpb, used:  1.3MB, speed: 4663 KB/sec

real	0m25.485s
user	0m20.577s
sys	0m0.376s
piotrek@msi-wind:~/Pulpit/enwik$ time ~/NetBeansProjects/cTarsaLZP/dist/Release/GNU-Linux-x86/ctarsalzp d enwik8.ari enwik8.dec

real	0m49.282s
user	0m47.367s
sys	0m1.228s
piotrek@msi-wind:~/Pulpit/enwik$ time ppmd/PPMd d enwik8.pmd
Fast PPMII compressor for textual data, variant J, Dec 18 2011
enwik8 already exists, overwrite?: <Y>es, <N>o, <A>ll, <Q>uit?y
enwik8:36800776 >100000000, 2.09 bpb, used:  1.3MB, speed: 4105 KB/sec

real	0m31.470s
user	0m23.221s
sys	0m0.600s
piotrek@msi-wind:~/Pulpit/enwik$```
> ppmd has additional options like -m and -r, so that may be not
> the best result on enwik9.
I've used -o2 and -m256 so that there was no model restart/cut-off, to make the comparison as fair as possible.

I have another idea for adaptive disabling. Instead of disabling either o2 model based on total frequency (temporarily not using models of contexts that were rarely seen in the whole process before), I could compute weighted sums of the intervals between o2 context occurrences. I would use simple timestamps to compute the interval lengths. I would, e.g., maintain two weighted interval sums - one giving a higher weight to recently incoming intervals than the other. Then I would have to convert the relation between those two sums into a decision about disabling a particular o2 model temporarily. Maybe use some APM to convert the history of recent weighted interval sum relations, plus the current relation, into a decision about disabling a particular o2 model.
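The timestamp idea above can be sketched as follows (everything here - the decay rates 0.75/0.95, the struct, the "active" criterion - is a hypothetical illustration, not the author's code):

```c
#include <stdint.h>

/* Per o2 context: two exponentially weighted sums of the intervals
   between occurrences, driven by a global symbol-position timestamp.
   fastSum decays quicker, so it weights recent intervals more. */
typedef struct {
    uint64_t lastSeen;
    double fastSum, slowSum;
} ContextStats;

void context_touch(ContextStats *c, uint64_t now) {
    double interval = (double)(now - c->lastSeen);
    c->fastSum = c->fastSum * 0.75 + interval; /* hypothetical rates */
    c->slowSum = c->slowSum * 0.95 + interval;
    c->lastSeen = now;
}

/* Normalize both sums to comparable averages; a context whose recent
   intervals are short relative to its longer-term history looks "active". */
int context_active(const ContextStats *c) {
    double fastAvg = c->fastSum * (1.0 - 0.75);
    double slowAvg = c->slowSum * (1.0 - 0.95);
    return fastAvg <= slowAvg;
}
```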

Besides, development/research will probably be slowed down for a while, as I'm moving and, additionally, my netbook's cooling system is likely broken.

24. ZPAQ bit histories are tuned for higher order models. For low order, simply using the last 8 bits as the bit history works better. However, ZPAQ does not have such a model.
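A minimal sketch of such a low-order indirect model (context -> last 8 bits -> slowly adapting prediction); the 12-bit probability scale, table sizes, and update rate are illustrative assumptions, not ZPAQ code:

```c
#include <stdint.h>

enum { NCTX = 256 };

uint8_t  history[NCTX];          /* last 8 coded bits per context        */
uint16_t prediction[NCTX][256];  /* slow-adapting p(bit=1), 12-bit scale */

/* Start every slot at p(1) = 0.5. */
void initModel(void) {
    for (int c = 0; c < NCTX; c++)
        for (int h = 0; h < 256; h++)
            prediction[c][h] = 2048;
}

/* Predict the next bit for 'ctx' from its current bit history. */
uint16_t predictBit(int ctx) {
    return prediction[ctx][history[ctx]];
}

/* After coding 'bit': slowly adapt the selected prediction slot,
   then shift the bit into the history. */
void updateBit(int ctx, int bit) {
    uint16_t *p = &prediction[ctx][history[ctx]];
    *p += ((bit << 12) - (int)*p) / 32;
    history[ctx] = (uint8_t)((history[ctx] << 1) | bit);
}
```

The point is that a sequence like 000001 selects its own prediction slot, so the model recalls what followed that exact pattern before, instead of committing to a fast or slow adaptation rate.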

Also there is some extra overhead. ZPAQ computes and stores SHA-1 checksums. It compresses to a temporary file and then appends it to the archive. Also, the CM includes both a prediction and a count. The count indexes a table to get the learning rate and then multiplies by the prediction error, rather than using a simple shift like fpaq0f2.
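The difference between the two update rules might be sketched like this; the rate table contents are made-up illustrations, not ZPAQ's actual schedule:

```c
#include <stdint.h>

/* fpaq0f2 style: a fixed shift, i.e. a constant learning rate. */
void updateFixedShift(uint16_t *p, int bit) {
    *p += ((bit << 12) - (int)*p) / 32;
}

/* ZPAQ CM style: the component stores a prediction and a count;
   the count indexes a rate table, so early updates adapt faster
   and the rate settles down as evidence accumulates. */
typedef struct { uint16_t p; uint16_t n; } Counter;

const int rateTable[16] =
    {2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 192, 256, 256};

void updateCounted(Counter *c, int bit) {
    int rate = rateTable[c->n];
    c->p += ((bit << 12) - (int)c->p) / rate;
    if (c->n < 15) c->n++;
}
```

The counted variant pays for the table lookup and the extra state, which is part of the overhead being described.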

I'm considering updating the ZPAQ standard to level 2 to support fast compressors. The idea would be to omit the context model and keep just the post-processor (written in ZPAQL). The preprocessed data would otherwise be stored uncompressed. This would be equivalent to compressing a string as an arbitrary program, plus its input data, that writes the string. The changes to libzpaq should be fairly minimal, and it would support arbitrary compression algorithms.

25. > I've used -o2 and -m256

It should be faster - http://compression.ru/ds/ppmsj.rar

> decision about disabling particular o2 model

As I said, it's not really a good idea. Why not just recompute the "bad" freq from lower orders, like ppmd does?
Disabling a whole context also prevents it from working where it could.

26. Ok, but what about ppms?

> It should be faster - http://compression.ru/ds/ppmsj.rar

No Makefile for GCC, and problems compiling under Ubuntu.

> As I said, it's not really a good idea. Why not just recompute the "bad" freq from lower orders, like ppmd does? Disabling a whole context also prevents it from working where it could.

I've talked about disabling for coding a specific symbol, not about disabling permanently or for a long run. Remember that I'm using switching between models, so disabling is essentially equivalent to preventing a switch to a particular model.

How, and how often, would you do the recomputing? Recomputing one symbol would take at least one division, as the probabilities aren't normalized. So it could mean low gain at high cost.

Matt:
I believe that the bit histories in your ICM and my idea about a pair of weighted interval sum estimators share some ideas.

Mainly, bit histories + APM allow for automatic detection of local stationarity: if we're in a nonstationary region of the file, the APM will favor the most recent part of the bit history, and if we're in a stationary region, the entire bit history will contribute to the computation.

My pair of weighted interval sum estimators should bring at least some of the advantages of the ICM. Having two estimators - one quickly adapting and one slowly adapting - we can detect if the data type has changed recently. My intuition is that if some context suddenly appears less often (ie the quickly adapting estimator shows a bigger interval length than the slowly adapting one) then we should favor the slowly adapting model and, by analogy, if some context suddenly appears more often then we should favor the quickly adapting model.

The problem is that I don't know whether my intuition is true, and I don't have a complete idea of how to use that homogeneity estimation, or how (and whether) it should affect model updating. There are also four models in total, so multiplying weighted costs (because that's what's used for switching between models) from only the o2 models will render the o1 models unusable. I need to sort that out.

27. http://nishi.dreamhosters.com/u/ppmsj_sh0.rar
http://nishi.dreamhosters.com/ppmsj_sh0/
Compile PPMs.cpp there, like "g++ -O3 PPMs.cpp"

> How and how often would you do recomputing?

Usually once - when the symbol occurs for the first time in the context.

28. I have checked what the result would be with ideal switching. The results are estimates, as I've used my low precision logarithm estimator (on random input it has less than 0.01 % error, so that should be enough).
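For reference, one common way to build a low-precision log2 estimator is to read the IEEE-754 exponent field and add a linear correction from the mantissa. This is a generic sketch of that technique (not the estimator actually used above, which is more accurate); it underestimates log2 by at most about 0.086:

```c
#include <stdint.h>
#include <string.h>

/* Low-precision log2 for positive normal floats:
   log2(m * 2^e) = e + log2(m), and log2(1+f) ~= f for f in [0,1). */
double fastLog2(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);               /* reinterpret safely */
    int exponent = (int)(bits >> 23) - 127;       /* unbiased exponent  */
    double f = (double)(bits & 0x7FFFFF) / (1 << 23);  /* fraction [0,1) */
    return exponent + f;                          /* exact at f = 0     */
}
```

Summing such values over a file gives a cheap total-code-length estimate for comparing models without doing any actual coding.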

Results for enwik8:

All four models ideally switched: 31 420 078 bytes
O2 models ideally switched with o1 models disabled: 33 813 520 bytes
Only slow o2 model: 38 223 691 bytes
Only fast o2 model: 38 391 744 bytes

Recall that the current working result is 37 067 568 bytes.

The results are quite surprising to me. It seems that the idea of switching based on weighted costs is either broken or insufficient. I also have a bug in the code that causes the o2 models to be heavily favored over the o1 models, and after fixing that bug compression gets worse. So it actually seems that explicit switching could give a gain! But I'm not convinced. I will search for better switching methods, probably not using cost estimation, as it's too slow (though maybe someone has a fast routine for that that they're willing to share).

I have thought about using those weighted interval sums. Currently my idea is to quantize the ratio of the fast and slow adapting weighted interval sums and map it, using an APM, to a cell containing four probabilities (the probability of each submodel being the best). For example, quantize o2slowsum / o2fastsum to 8 values. I could also quantize o1slowsum / o1fastsum to eg 4 values and make a 32-bin 2D SSE. I could additionally quantize the current o2 interval length as the next APM dimension. Finally, I could store the last permutation (or two) per o2 context of the order of probabilities (ie the order of probabilities from all four submodels - there would be 4! = 24 permutations). I need to test that and see if it works.
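The quantization step might look something like this sketch; the bin boundaries and the 2D bin layout are purely illustrative guesses:

```c
#include <stdint.h>

/* Quantize slowSum/fastSum to 8 bins. The ratio is scaled by 16,
   so r == 16 means the two sums are equal. A small r means the fast
   sum dominates (recent intervals are long: the context is fading
   out); a large r means recent intervals are short (the context is
   appearing more often lately). */
int quantizeRatio8(int32_t slowSum, int32_t fastSum) {
    if (fastSum <= 0) return 0;
    int32_t r = (int32_t)(((int64_t)slowSum * 16) / fastSum);
    if (r < 4)  return 0;   /* appearing much less often lately */
    if (r < 8)  return 1;
    if (r < 12) return 2;
    if (r < 16) return 3;
    if (r < 21) return 4;
    if (r < 32) return 5;
    if (r < 64) return 6;
    return 7;               /* appearing much more often lately */
}

/* 32-bin 2D SSE index: 8 values of the o2 ratio x 4 of the o1 ratio. */
int sseBin(int32_t o2slow, int32_t o2fast,
           int32_t o1slow, int32_t o1fast) {
    return quantizeRatio8(o2slow, o2fast) * 4
         + (quantizeRatio8(o1slow, o1fast) >> 1);
}
```

Each bin would then hold the four adaptively updated submodel probabilities described above.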

In the meantime I'm attaching the current source code (Linux binaries included).

And PPMs is indeed faster than PPMd, but not by much:
Code:
```piotrek@msi-wind:~/Pulpit/enwik$ time ppmsj_sh0/PPMs e -o2 enwik8
Small PPMII compressor with 1024KB memory heap, variant J, Jan  6 2012
enwik8:100000000 >36866726, 2.09 bpb, CutOffs:  5, speed: 5083 KB/sec

real	0m19.855s
user	0m18.801s
sys	0m0.412s
piotrek@msi-wind:~/Pulpit/enwik$ time ppmd/PPMd e -m256 -o2 enwik8
Fast PPMII compressor for textual data, variant J, Dec 18 2011
enwik8:100000000 >36800776, 2.09 bpb, used:  1.3MB, speed: 4654 KB/sec

real	0m22.506s
user	0m20.509s
sys	0m0.504s
piotrek@msi-wind:~/Pulpit/enwik$```

After a quick investigation: those "ideal switching" results are probably meaningless, as they get too good with some crazy update ratios.

29. I presume you're not coding any model ids in your "ideal switching".
Like that it's certainly worthless, because such switching between submodels which give p=0 and p=1 would compress everything to zero.
The real "ideal switching" is called mixing.
It makes use of the fact that the information about which submodel is best for the next symbol is unnecessary for us,
and simply adjusts the probability distribution of the actual symbols taking into account the p.d. of submodel selection.
Thus switching is always worse than mixing based on the same model.
Basically, mixing is just a more efficient coding method for switching: it lets us encode the actual symbols
with exactly the same entropy, while skipping the coding of the submodel choice.

However, there are also some benefits to explicit switching:
1. Mixing is always symmetric, but switching lets us build formats with faster decoding.
Good parsing optimization would even reduce the redundancy from explicit model choice coding to a negligible value.
With update exclusion, it can actually show better results than mixing.
2. Redundancy can be avoided if the information given by the submodel choice is removed from the p.d. of symbols,
ie masking is applied. Knowing which submodel is best for the next symbol, we can locate the symbols for which it is best,
fix up their probabilities, and only encode the information yet unknown to the decoder.
It's somewhat similar to what PPM does.
But this approach is also symmetric and relatively complicated, so CM is usually preferred.

30. Let w_1, ..., w_n be the probabilities that the (next) symbol x is drawn according to the distribution p_i(x), 0<i<=n.

1. Mixing

We define the linear mixture probability for x to be

p(x) := w_1 p_1(x) + ... + w_n p_n(x).

The (expected) code length for x is exactly

-log2( p(x) ).

2. Switching

We select model "i" with probability w_i. The expected code length is

- w_1 log2(p_1(x)) - ... - w_n log2(p_n(x))

= -log2( p_1(x)^w_1 p_2(x)^w_2 ... p_n(x)^w_n )

>= -log2( w_1 p_1(x) + ... + w_n p_n(x) ) [Jensen's inequality]

= -log2( p(x) ).

Thus, in expectation mixing is never worse than switching.

PS: I'd really appreciate TeX support.
