
Originally Posted by
Jean-Marie Barone
Hello to all,
I am trying my hand at Arithmetic Coding after reading Mark Nelson's tutorial, and I have written 2 functions from scratch, SMAC_compress & SMAC_decompress, in Assembly language (see the attached SMAC.asm if you're interested).
To start, I'm using a Static order-0 model, working at byte level. Later, I plan to update the Model so that every time a byte is encoded, I decrement its frequency in the frequency table and recalculate the Upper & Lower limits in the model. But one step at a time!
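As a rough illustration of such a static order-0 model, here is a minimal Python sketch (not the code from SMAC.asm). The name build_static_model and the half-open [Lower, Upper) convention are assumptions made for this example only.

```python
from collections import Counter

def build_static_model(data: bytes):
    """Return ({byte: (lower, upper)}, total) from byte frequencies.

    Hypothetical helper: each byte value that occurs in `data` gets a
    half-open slice [lower, upper) of the cumulative frequency scale.
    """
    counts = Counter(data)
    limits = {}
    lower = 0
    for symbol in sorted(counts):          # fixed order, so the decoder can rebuild it
        upper = lower + counts[symbol]
        limits[symbol] = (lower, upper)
        lower = upper
    return limits, lower                   # lower now equals the total count

limits, total = build_static_model(b"abracadabra")
for symbol, (lo, hi) in limits.items():
    print(chr(symbol), lo, hi)
print("total", total)
```

Because the model is static, the same table has to be available to the decompressor, for instance by storing the frequencies with the compressed data.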
I've tested my functions with a small 9 KB file (write.exe), and they work well, although the compression ratio & entropy are poor; I know I can't expect much from a static model.
With a larger file, I'm running into the dreaded Underflow situation, and I'm not sure how to handle it. Concretely, I end up with the following values (hex):
High=E70001DA Low=E6FFFB93 Range=648
If I apply the "rule", I should drop the 2nd byte of each value, shift the following bytes left, and would obtain:
High=E701DAFF Low=E6FB9300 Range=64800 UnderflowCount=1
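The step above can be checked with a minimal Python sketch, assuming 32-bit High/Low registers and Range = High - Low + 1. The function names is_byte_underflow and drop_second_byte are made up for this illustration.

```python
def is_byte_underflow(high: int, low: int) -> bool:
    """Top bytes differ by 1, and the 2nd bytes are 00 (High) and FF (Low)."""
    return ((high >> 24) - (low >> 24) == 1
            and (high >> 16) & 0xFF == 0x00
            and (low >> 16) & 0xFF == 0xFF)

def drop_second_byte(high: int, low: int):
    """Keep the top byte, remove the 2nd byte, shift the lower two bytes left.

    High is refilled with FF and Low with 00, as in the values above.
    """
    high = (high & 0xFF000000) | ((high & 0x0000FFFF) << 8) | 0xFF
    low = (low & 0xFF000000) | ((low & 0x0000FFFF) << 8)
    return high, low

high, low = 0xE70001DA, 0xE6FFFB93
underflow_count = 0
print(f"before: High={high:08X} Low={low:08X} Range={high - low + 1:X}")

if is_byte_underflow(high, low):
    high, low = drop_second_byte(high, low)
    underflow_count += 1                   # one dropped byte is now pending

print(f"after:  High={high:08X} Low={low:08X} Range={high - low + 1:X} "
      f"UnderflowCount={underflow_count}")
```

Dropping one byte multiplies the Range by 256 (hex 648 becomes hex 64800), which is what gives the coder room to keep going.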
And things could go on like before, except that after the next Most Significant Byte is output from High & Low, I would also output... well, that's exactly my question! Mark Nelson writes that I should output 00 or FF "depending on whether High and Low converged to the higher or lower value". I must admit I don't get it...
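One possible reading of that rule, carried over from Nelson's bit-level scheme to bytes, is sketched below in Python. It is an assumption, not something confirmed by the tutorial: when underflow was detected, the dropped byte was 00 in High and FF in Low, so if the top bytes later converge to High's value the pending bytes would be 00, and if they converge to Low's value they would be FF. The name emit_converged and its parameters are hypothetical.

```python
def emit_converged(top_byte: int, low_top_at_underflow: int,
                   underflow_count: int, out: bytearray) -> int:
    """Output the converged top byte, then the pending underflow bytes.

    low_top_at_underflow is Low's top byte when underflow was detected
    (E6 in the example); High's top byte was that value plus 1 (E7).
    Returns the new underflow count, which is always 0.
    """
    out.append(top_byte)
    if top_byte == low_top_at_underflow:
        filler = 0xFF                      # converged to the lower value: E6 FF ...
    else:
        filler = 0x00                      # converged to the higher value: E7 00 ...
    out.extend([filler] * underflow_count)
    return 0

lower_case = bytearray()
emit_converged(0xE6, 0xE6, 1, lower_case)
print(lower_case.hex().upper())            # E6 followed by the pending FF

higher_case = bytearray()
emit_converged(0xE7, 0xE6, 1, higher_case)
print(higher_case.hex().upper())           # E7 followed by the pending 00
```

Under this reading, the decompressor would have to apply the same underflow test and drop the same byte from its code register to stay in step.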
Also, isn't there a simpler method to get rid of underflow? Something like a "Rescale" maybe, I dunno?
Thanks in advance for your help!