was experimenting with a bitwise range coder backend for BTIC1H (one of my video codecs). it basically hangs out "behind" the normal bit read/write functions, and behind the Rice coding (the bits for the Rice coder are fed through the range coder). (ADD: basically, a similar idea to the one used in VP8/VP9).
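for readers unfamiliar with the arrangement, here is a minimal sketch of an adaptive binary range coder with a Rice code fed through it one bit at a time. it is not BTIC1H's actual code: every name is hypothetical, and the 32-bit range, 11-bit probabilities, byte-wise re-normalization, and the split into one context for prefix bits and one for remainder bits are all assumptions borrowed from the common LZMA-style design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* all names here are hypothetical; this is not BTIC1H's actual code */
#define RC_TOP   (1u << 24)   /* re-normalize when range drops below this */
#define RC_PBITS 11           /* probabilities are 11-bit, P(bit == 0) */
#define RC_RATE  5            /* adaptation shift */

typedef struct {
    uint64_t low;             /* bit 32 holds a not-yet-propagated carry */
    uint32_t range;
    uint8_t  cache;           /* byte held back until the carry is known */
    size_t   pending;         /* held-back byte plus any 0xFF run after it */
    uint8_t *out; size_t pos, cap;
} RcEnc;

typedef struct { uint32_t range, code; const uint8_t *in; size_t pos, len; } RcDec;

static void rc_enc_init(RcEnc *e, uint8_t *buf, size_t cap) {
    e->low = 0; e->range = 0xFFFFFFFFu; e->cache = 0; e->pending = 1;
    e->out = buf; e->pos = 0; e->cap = cap;
}

static void rc_shift_low(RcEnc *e) {   /* move one byte out, resolving carries */
    if ((uint32_t)e->low < 0xFF000000u || (e->low >> 32) != 0) {
        uint8_t carry = (uint8_t)(e->low >> 32), b = e->cache;
        do {
            assert(e->pos < e->cap);
            e->out[e->pos++] = (uint8_t)(b + carry);
            b = 0xFF;
        } while (--e->pending != 0);
        e->cache = (uint8_t)(e->low >> 24);
    }
    e->pending++;
    e->low = (e->low & 0x00FFFFFFu) << 8;
}

static void rc_encode_bit(RcEnc *e, uint16_t *p, int bit) {
    uint32_t bound = (e->range >> RC_PBITS) * *p;   /* share of range for a 0 */
    if (!bit) { e->range = bound; *p = (uint16_t)(*p + (((1u << RC_PBITS) - *p) >> RC_RATE)); }
    else { e->low += bound; e->range -= bound; *p = (uint16_t)(*p - (*p >> RC_RATE)); }
    while (e->range < RC_TOP) { e->range <<= 8; rc_shift_low(e); }
}

static size_t rc_enc_flush(RcEnc *e) {
    for (int i = 0; i < 5; i++) rc_shift_low(e);
    return e->pos;
}

static uint8_t rc_next(RcDec *d) { return d->pos < d->len ? d->in[d->pos++] : 0; }

static void rc_dec_init(RcDec *d, const uint8_t *buf, size_t len) {
    d->in = buf; d->len = len; d->pos = 1;   /* first output byte is always 0 */
    d->range = 0xFFFFFFFFu; d->code = 0;
    for (int i = 0; i < 4; i++) d->code = (d->code << 8) | rc_next(d);
}

static int rc_decode_bit(RcDec *d, uint16_t *p) {
    uint32_t bound = (d->range >> RC_PBITS) * *p;   /* must match the encoder */
    int bit = d->code >= bound;
    if (!bit) { d->range = bound; *p = (uint16_t)(*p + (((1u << RC_PBITS) - *p) >> RC_RATE)); }
    else { d->code -= bound; d->range -= bound; *p = (uint16_t)(*p - (*p >> RC_RATE)); }
    while (d->range < RC_TOP) { d->range <<= 8; d->code = (d->code << 8) | rc_next(d); }
    return bit;
}

/* Rice code, parameter k: unary quotient (1s then a 0), then k remainder bits.
   ctx[0] models prefix bits, ctx[1] remainder bits (an assumed context split). */
static void rice_write(RcEnc *e, uint16_t ctx[2], unsigned v, unsigned k) {
    for (unsigned q = v >> k; q > 0; q--) rc_encode_bit(e, &ctx[0], 1);
    rc_encode_bit(e, &ctx[0], 0);
    for (unsigned i = k; i-- > 0; ) rc_encode_bit(e, &ctx[1], (int)((v >> i) & 1u));
}

static unsigned rice_read(RcDec *d, uint16_t ctx[2], unsigned k) {
    unsigned q = 0, r = 0;
    while (rc_decode_bit(d, &ctx[0])) q++;
    for (unsigned i = 0; i < k; i++) r = (r << 1) | (unsigned)rc_decode_bit(d, &ctx[1]);
    return (q << k) | r;
}

/* encode n values, decode them again, return 1 if they all match */
static int rice_roundtrip_ok(const unsigned *v, size_t n, unsigned k) {
    uint8_t buf[4096];
    uint16_t ec[2] = { 1024, 1024 }, dc[2] = { 1024, 1024 };
    RcEnc e; RcDec d;
    rc_enc_init(&e, buf, sizeof buf);
    for (size_t i = 0; i < n; i++) rice_write(&e, ec, v[i], k);
    rc_dec_init(&d, buf, rc_enc_flush(&e));
    for (size_t i = 0; i < n; i++)
        if (rice_read(&d, dc, k) != v[i]) return 0;
    return 1;
}
```

note that the decoder has to reproduce the encoder's `bound` computation, adaptation rate, and re-normalization points exactly; in a design like this, any difference there changes how every later bit is interpreted.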
in some offline frame-encoding tests, it nearly halves the bitrate, though results in actual video seem to be a bit more modest.
I am mostly trying to fiddle with it to improve speed, as it is currently kind of slow and barely maintains 30fps for real-time 1080p encoding in this mode (a raw bitstream is closer to 50fps for 1080p encoding, for a single encoder thread on a 3.4GHz AMD Phenom II with PC3-10600 RAM). encoder speed sometimes dips below 30fps, causing frames to be inserted during capture.
one issue I have noticed with range coding, though, is that it seems to be very brittle: the slightest tweak to the range coder breaks the ability to decode previous output (this is not an issue with a raw bitstream).
I am not sure if anyone knows a good way to make the range coder less sensitive to slight changes (such as when/how re-normalization happens, ...)?
also, noted, in most tests the branching versions of the logic seem to be faster than the branch-free versions (had been experimenting with making both the read/write bit and re-normalization be branch-free).
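to illustrate the kind of trade-off involved, here is a sketch of one piece of such a coder, the adaptive probability update, written both ways. the function names, the 11-bit probability scale, and the shift of 5 are assumptions for illustration, not values taken from BTIC1H.

```c
#include <stdint.h>

/* p is an 11-bit probability that the next bit is 0 (assumed scale),
   adapted with a shift of 5 (assumed rate); bit must be 0 or 1 */
static uint32_t prob_update_branchy(uint32_t p, uint32_t bit) {
    if (bit) return p - (p >> 5);
    return p + ((2048u - p) >> 5);
}

static uint32_t prob_update_branchfree(uint32_t p, uint32_t bit) {
    uint32_t mask = 0u - bit;          /* 0 for bit==0, all ones for bit==1 */
    uint32_t up   = (2048u - p) >> 5;  /* step taken after a 0 bit */
    uint32_t down = 0u - (p >> 5);     /* step taken after a 1 bit (mod 2^32) */
    return p + ((up & ~mask) | (down & mask));
}
```

the branch-free form always computes both steps and then selects one, so it does more arithmetic per bit; whether that beats a branch likely depends on how predictable the bits are and on what the compiler generates for the branched form.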
ADD:
I have since fiddled with it enough that the branch-free versions are roughly break-even with the branched versions.
a slight optimization was only re-normalizing every 4 bits (*), but this requires limiting the model weights slightly to avoid the range collapsing prior to re-normalization.
current normalization may input/output up to 16 bits at a time.
*: this depends on the number of bits being read/written at a time, which needs to match between encoder and decoder. if the number of bits is not a multiple of 4, it will re-normalize at the end of the sequence of bits.
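the reason re-normalizing only every 4 bits forces a limit on the model weights can be seen with a small worst-case calculation. the sketch below assumes a 32-bit range that is at least 2^24 after re-normalization and 11-bit weights; these parameters and the function name are illustrative assumptions, not BTIC1H's actual values.

```c
#include <stdint.h>

/* worst-case range after nbits coded bits with no re-normalization in
   between, if every bit lands on the branch with weight w_min out of
   (1 << prob_bits); hypothetical helper for illustration only */
static uint32_t worst_case_range(uint32_t range, uint32_t w_min,
                                 unsigned prob_bits, unsigned nbits) {
    for (unsigned i = 0; i < nbits; i++)
        range = (range >> prob_bits) * w_min;
    return range;
}
```

under these assumptions, `worst_case_range(1u << 24, 31, 11, 4)` gives 0 (the range has collapsed and nothing further can be decoded), while clamping the smallest weight to 256 gives `worst_case_range(1u << 24, 256, 11, 4) == 4096`, which still leaves a usable interval.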
encoder-side I-frame Mpix/sec still falls slightly below what is needed for 1080p30.
ADD2: partial defeat: need to re-normalize every bit, or else I can't get it to decode reliably.
OTOH: testing encoding video to DVD or VCD specs (similar resolution and bitrate), gets pretty decent quality.
ADD3: note that range-coding is optional and experimental, and a raw bitstream will likely remain as the primary format. the format is indicated partly via new TLV tags and a new small header just before the start of the encoded stream. this was chosen over trying to change the bitstream format mid-stream (as was done in BTLZA), as switching formats mid-stream is an ugly mess.
ADD4: also half-imagining the possibility of trying something similar to an order-2 PPM with 4 bit symbols. this would need about 3kB of context per model. would be encoded using an adaptive coding similar to AdRice (but with symbols limited to 8 bits, and saving 1 bit for the longest prefix, using fixed lookup tables, would need ~1kB for LUT). sane?...
thoughts?...