Some papers actually do exist:
scholar.google.com
Maybe there's something useful there? For example
https://arxiv.org/pdf/1402.3392.pdf ?
> comparing to large alphabet AC/RC, it pessimistically requires alphabet size multiplications/symbol -
> rANS requires just one instead (and it can be approximated with ~2 additions), at cost of backward encoding.
Actually AC is compatible with the same counter update methods as rANS -
either two additions with a "frequency table", or vector multiplication, like
what fgiesen used for LZNA nibble coding.
AFAIK, the number (and type) of operations required for RC and rANS is the same,
and most speed optimization tricks apply to both, although RC encoding is more complicated
because of carry handling. It's just that nobody has implemented a vector RC with optimizations for modern CPUs.
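
To make the "same model, different coder" point concrete, here is a rough Python sketch (not taken from any real codec). Both coders read the same (start, freq) pair from one static table, so any counter update scheme that maintains that table should work for either. Everything here is an assumption made for illustration: the 4-symbol table, the names FREQ/START/find_symbol, and the use of Python big integers instead of renormalization, which means the RC carry handling mentioned above doesn't appear at all.

```python
# Toy static model shared by both coders (hypothetical table, total = 16).
SCALE_BITS = 4
M = 1 << SCALE_BITS
FREQ = {"a": 8, "b": 4, "c": 3, "d": 1}

# Cumulative starts: START[s] is the sum of frequencies of earlier symbols.
START = {}
_acc = 0
for _sym, _f in FREQ.items():
    START[_sym] = _acc
    _acc += _f
assert _acc == M


def find_symbol(slot):
    """Map a slot in [0, M) back to its symbol (linear scan for clarity)."""
    for sym, start in START.items():
        if start <= slot < start + FREQ[sym]:
            return sym
    raise ValueError("slot out of range")


def rans_encode(msg):
    """Big-integer rANS: symbols are pushed in reverse order."""
    x = 1
    for s in reversed(msg):
        f, c = FREQ[s], START[s]
        x = (x // f) * M + c + (x % f)
    return x


def rans_decode(x, n):
    out = []
    for _ in range(n):
        slot = x % M
        s = find_symbol(slot)
        x = FREQ[s] * (x // M) + slot - START[s]
        out.append(s)
    return "".join(out)


def rc_encode(msg):
    """Exact (big-integer) range coder: forward order, no carries needed."""
    low, rng = 0, M ** len(msg)
    for s in msg:
        r = rng // M          # a shift, since M is a power of two
        low += START[s] * r
        rng = r * FREQ[s]
    return low                # any value in [low, low + rng) would do


def rc_decode(code, n):
    out = []
    rng = M ** n
    for _ in range(n):
        r = rng // M
        slot = code // r
        s = find_symbol(slot)
        code -= START[s] * r
        rng = r * FREQ[s]
        out.append(s)
    return "".join(out)
```

Usage: rans_decode(rans_encode("abacad"), 6) and rc_decode(rc_encode("abacad"), 6) both give back "abacad". The only model-facing steps are the FREQ/START lookups and find_symbol, and they are identical for both coders; what differs is the state update around them and the encoding direction.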