Sitting over Andrew Polar's carry-less range-coder design, which is an extrapolation of the Guazzo/F0-coder, I became aware of a very good way to explain what a carry-less coder is and why the topic of minimum coder-termination is about the same thing.
Here is the basic coder-loop:
Code:
range = high - low;
/* We're doing limited precision arithmetic until ... */
high = (range * ivl->high) / ivl->total + 0;
low += (range * ivl->low ) / ivl->total + 1;
/* ... we don't have enough range to code the next symbol. */
if ((high - low) < minRange)
/* Therefor we make the coder reset by ... */
high = low;
/* ... executing the loop until low and high are different on enough digits. */
while ((low ^ high) < topRange) {
consumeHiSample(low);
/* expand high with trailing 1s */
high = (high << 8) | 0xFF;
low = (low << 8) | 0x00;
}
What happens here is that in case we run out of precision, a straight reset of the coder is issued. Let's imagine our coder's initial state is low=0,high=FFFFFFFF,minRange=0000FFFF, and we encounter the precision-underflow on something like low=0000F000,high=00010000, not enough for the upcoming interval; the code set low and high equal which forces the loop to shift the entire "low"-register out, and FFs into the "high"-register; voila: low=0,high=FFFFFFFF.
In effect I could also write this:
Code:
range = high - low;
/* We're doing limited precision arithmetic until ... */
high = (range * ivl->high) / ivl->total + 0;
low += (range * ivl->low ) / ivl->total + 1;
/* ... we don't have enough range to code the next symbol. */
if ((high - low) < minRange) {
/* Therefor we make the coder reset by first fullstop() ... */
consumeAllHiSamples(low); // 4 bytes
/* expand high with trailing 1s */
low = (low << 4 * 8) | 0x00000000;
high = (high << 4 * 8) | 0xFFFFFFFF;
/* ... then start() */
low = 0x00000000;
high = 0xFFFFFFFF;
}
else {
while ((low ^ high) < topRange) {
consumeHiSample(low);
/* expand high with trailing 1s */
low = (low << 8) | 0x00;
high = (high << 8) | 0xFF;
}
}
Now what becomes apparent, that if one could find a less waste-full way to do a stop(), one could make a more efficient carry-less coder. The task is to find a valid "low" on reset which still enables the decoder to uniquely decode the previously coded interval ("low" entirely is always the safe bet), but which is as short as possible (to have less waste):
Code:
/* Therefore we make the coder reset by first minimalstop() ... */
X = consumeAsMuchSamplesAsNecessary(low); // 0-4 bytes
/* expand high with trailing 1s */
low = (low << X * 8) | (0x00 + 1 << X * 8 - 1);
high = (high << X * 8) | (0xFF + 1 << X * 8 - 1);
We have there the possibiliy to search for a specific "low" in which the maximum amount of trailing digits is "irrelevant", it can be 0 or 1 or whatever it would still enable the decoder to decode the interval. But as we currently have the possibility to reset the coder to any (!) new state we want, we could also utilize a specific "low":
Code:
while ((high - low) < minRange) {
/* Therefor we make the coder reset by first astop() ... */
consumeHiSample(low); // 1 byte
/* expand high with trailing 1s */
low = (low << 8) | 0x00;
high = (high << 8) | 0xFF;
/* ... then restart() */
low = 0x00000000 + low;
range = 0xFFFFFFFF - low;
high = low + range;
}
What this code does is, recycling the previous "low" (which we know will always allow the decoder to uniquely decode the next interval) which of course requires us to readjust our high accordingly (high + low must be below the maximum value that the precision allows: 0xFFFFFFFF in the example).
And that is actually exactly what this line in the Shkarin/Subbotin-implemention means:
Code:
range = ((-low) & (RANGE_BOT - 1)) << 8;
It means "we reset the range-coder with some small amount of digits, taking into acount that we can recycle the previous low, adjusting just high".
Now it must have become utterly clear, that once you find a sound and effective way to stop an arithmetic-coder design, you are also inherently able to make it an effective carry-less implementation; because carry-less/underflow-free is nothing else than resetting your coder once he can not function anymore.
Okay, I just wanted to draw out this relationship, maybe it's something new for you, maybe it's just a little nicer fairy-tale explanation of what's happening there; I also wanted to make clear I'm not nuts.
Addendum, advanced modifications:
The mechanism above operates on multiple of 8 bits, and makes no attempt to figure out the exact position in the "low"-register where the rest of the digits is "irrelevant" (make no difference for the decoder). This means the Subbotin-Coder can be made more efficient by finding the cutoff-point in "low" and setting the irrelevant bits of "low" to zero, which in effect makes the renewed "range" bigger, which delays the next point-of-reset a little bit longer:
Code:
if ((high - low) < minRange) {
low = low & (0xFFFFFFFF << (32 - minimumHiDigits(low)));
high = 0xFFFFFFFF - low;
}
while ((high - low) < minRange)
...
And that is then pretty much what Charles (not Ashford, Bloom it is
) is doing when he takes the choice between adjusting "low" (we can do that, we're currently resetting the coder!) or better "high", whichever of the two choices leaves us with the bigger range ("high - low") always considering that we have to make sure that our chosen "low" still allows the decoder to decode the previous interval.