I have noticed that some programmers track the probability of a "zero" bit in their coders (e.g. Igor Pavlov, Shelwien, and myself), while others track the probability of a "one" bit (e.g. Matt Mahoney). In my view it's just a design decision. But what about the facts under the hood? Does information theory favor one convention over the other? Which one do you use, and why?
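To make the question concrete, here is a minimal sketch (my own illustration, not code taken from any of the coders mentioned) of the two conventions side by side, using an arbitrary 12-bit fixed-point probability and a simple shift-based update. With complementary initial states, the two counters stay exact mirrors of each other, which is why the choice looks like a pure convention:

```python
PROB_BITS = 12
ONE = 1 << PROB_BITS      # fixed-point representation of probability 1.0
RATE = 5                  # adaptation speed (arbitrary for this demo)

def update_p0(p0, bit):
    """Convention A: state is P(bit == 0)."""
    if bit == 0:
        return p0 + ((ONE - p0) >> RATE)  # move toward 1.0
    return p0 - (p0 >> RATE)              # move toward 0.0

def update_p1(p1, bit):
    """Convention B: state is P(bit == 1)."""
    if bit == 1:
        return p1 + ((ONE - p1) >> RATE)
    return p1 - (p1 >> RATE)

# Feed the same bit stream to both models, starting from the
# complementary initial states p0 = p1 = 1/2.
p0 = p1 = ONE // 2
for bit in [0, 0, 1, 0, 1, 1, 0, 0]:
    p0 = update_p0(p0, bit)
    p1 = update_p1(p1, bit)

# The invariant p0 + p1 == ONE holds after every step, because the
# amount one counter gains is exactly the amount the other loses.
print(p0 + p1 == ONE)  # → True
```

So at least for this style of update, the coded interval sizes come out identical either way, and the question reduces to taste and implementation detail rather than coding efficiency.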