http://ctxmodel.net/files/rcbwt_v0.rar
Contains two proof-of-concept implementations:
- "not working" uses a static o0 rangecoder for storing input data
in memory and compares encoded substrings without decoding.
It actually produces something very similar to BWT on output, but
regrettably the rangecoder doesn't _always_ preserve the lexicographic
order of strings, thus "not working".
- "working" uses a static o0 bit-aligned rangecoder, with ~485k
compressed size for book1.
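
To illustrate the idea, here's a toy sketch (not the actual rcbwt code):
rotations are sorted by comparing their encoded bit strings directly,
without decoding. The 4-symbol code table below is made up; the result
matches a plain rotation sort only because this particular code preserves
lexicographic order, which is exactly what the o0 rangecoder doesn't
always guarantee.
Code:
// Toy sketch: build a BWT by sorting rotations while comparing their
// encoded bit strings, without decoding them. The code table is invented.
#include <cstdio>
#include <string>
#include <vector>
#include <algorithm>

static const char* cw[4] = { "0", "10", "110", "111" }; // codewords for 'a'..'d'
                                                        // (prefix-free, in symbol order)

// produces the encoded bits of rotation `rot` of `s` one at a time,
// so a comparison never materializes a whole encoded rotation
struct BitStream {
  const std::string& s;
  size_t rot, sym, bit;
  BitStream(const std::string& s_, size_t r) : s(s_), rot(r), sym(0), bit(0) {}
  int next() {                                // next encoded bit, or -1 at end
    while (sym < s.size()) {
      const char* c = cw[s[(rot + sym) % s.size()] - 'a'];
      if (c[bit]) return c[bit++] - '0';
      ++sym; bit = 0;
    }
    return -1;
  }
};

// strict "less" over the encoded forms of two rotations
static bool encodedLess(const std::string& s, size_t i, size_t j) {
  BitStream a(s, i), b(s, j);
  for (;;) {
    int x = a.next(), y = b.next();
    if (x != y) return x < y;
    if (x < 0) return false;                  // identical encodings
  }
}

int main() {
  std::string s = "cadabdacab";               // toy input over 'a'..'d'
  std::vector<size_t> idx(s.size()), ref(s.size());
  for (size_t i = 0; i < s.size(); ++i) idx[i] = ref[i] = i;

  // sort rotations via their encoded forms only
  std::sort(idx.begin(), idx.end(),
            [&](size_t i, size_t j) { return encodedLess(s, i, j); });

  // reference: sort the raw rotations
  std::sort(ref.begin(), ref.end(), [&](size_t i, size_t j) {
    for (size_t k = 0; k < s.size(); ++k) {
      char a = s[(i + k) % s.size()], b = s[(j + k) % s.size()];
      if (a != b) return a < b;
    }
    return false;
  });

  // BWT = last symbol of each sorted rotation; both orders agree here
  // because this particular code preserves lexicographic order
  for (size_t i : idx) putchar(s[(i + s.size() - 1) % s.size()]);
  printf("\n%s\n", idx == ref ? "same order as the plain rotation sort"
                              : "order differs");
  return 0;
}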

http://ctxmodel.net/files/rcbwt_v1.rar
A complete rewrite, using an order1 optimal ordered bitcode
(381.5k on book1) for input data storage.
It really produces a reversible BWT on output... for small enough files.
Chunked processing is also implemented, so memory usage is static,
except for the compressed input data.
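
For reference, here's my reading of what "ordered" means (just a property
check on a toy order0 table; the actual order1 code tables and context
handling are not reproduced): the codewords have to be prefix-free and
sorted in the same lexicographic order as their symbols, so that
concatenated codewords compare exactly like the original strings.
"Optimal" then presumably means minimal average code length under that
ordering constraint, built per order1 context.
Code:
// Sketch of the ordering property an "ordered" bitcode has to satisfy
// (toy order0 tables; the actual rcbwt tables are not reproduced here).
#include <cstdio>
#include <string>
#include <vector>

// true if the codewords are prefix-free and sorted in the same
// lexicographic order as the symbols they encode; such a code keeps
// E(x) < E(y) whenever x < y for strings x, y
static bool isOrderedPrefixCode(const std::vector<std::string>& cw) {
  for (size_t i = 0; i < cw.size(); ++i)
    for (size_t j = 0; j < cw.size(); ++j)
      if (i != j && cw[j].compare(0, cw[i].size(), cw[i]) == 0)
        return false;                    // cw[i] is a prefix of cw[j]
  for (size_t i = 0; i + 1 < cw.size(); ++i)
    if (!(cw[i] < cw[i + 1]))
      return false;                      // codeword order breaks symbol order
  return true;
}

int main() {
  // same codeword lengths, assigned with / without regard to symbol order
  std::vector<std::string> ordered  = { "0", "10", "110", "111" };
  std::vector<std::string> shuffled = { "10", "0", "111", "110" };
  printf("ordered : %s\n", isOrderedPrefixCode(ordered)  ? "order-preserving" : "NOT order-preserving");
  printf("shuffled: %s\n", isOrderedPrefixCode(shuffled) ? "order-preserving" : "NOT order-preserving");
  return 0;
}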

Code:
Input file size            = 768771 (book1)
Memory usage:
 - o1 static bitcode model = 34473984
 - clustering statistics   = 4194304
 - BWT index table         = 65536
 - source data buffer      = 381510
 TOTAL                     = 39115334

Input file size            = 1000000 (enwik6)
Memory usage:
 - o1 static bitcode model = 34473984
 - clustering statistics   = 4194304
 - BWT index table         = 65536
 - source data buffer      = 529780
 TOTAL                     = 39263604
Unfortunately, some "features" still have to be fixed:
- The "static bitcode model" is too large (mainly due to the decoding lookup table)
- Longer files (e.g. enwik8) cause a 16-bit code overflow
- The case of too many strings in a chunk is not handled yet (infinite loop);
  see the sketch after this list
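
On the last point, the 65536-entry index table looks like 16-bit string
indices per chunk (just a guess from the numbers above). A trivial guard
along these lines would at least turn the infinite loop into a clean
chunk split or failure; the names here are made up:
Code:
// Hypothetical guard for the per-chunk string limit; not rcbwt code.
#include <cstdint>
#include <cstdio>

enum { MAX_STRINGS_PER_CHUNK = 65536 };      // max strings addressable with 16-bit indices

// refuse to add a string once the chunk is full, so the caller can
// start a new chunk instead of looping forever
static int addStringToChunk(uint32_t& count) {
  if (count >= MAX_STRINGS_PER_CHUNK) return 0;
  ++count;
  return 1;
}

int main() {
  uint32_t count = 0;
  for (uint32_t i = 0; i < 70000; ++i)       // more strings than one chunk can hold
    if (!addStringToChunk(count)) break;
  printf("strings accepted into this chunk: %u\n", (unsigned)count);  // 65536, no index wraparound
  return 0;
}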

Still, I think it's interesting enough as it is.