Discussion of bitwise BWT on this forum started in 2010.
The output of a bitwise BWT looks interesting because you would
only need to encode the lengths of the runs, right?
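
To make the run-length idea concrete, here is a minimal sketch in Python. It is not anyone's actual implementation: it sorts all rotations naively (quadratic memory, only usable on tiny inputs), it assumes MSB-first bit order within each byte, and the function names (to_bits, naive_bwt, run_lengths, bitwise_bwt_runs) are made up for illustration.

```python
from itertools import groupby


def to_bits(data: bytes) -> str:
    # Assumption: bits are taken MSB-first within each byte.
    return "".join(f"{byte:08b}" for byte in data)


def naive_bwt(symbols: str) -> tuple[str, int]:
    # Rotation-based BWT without a sentinel. Sorting full rotations is
    # only for illustration; a real tool would use a suffix array.
    n = len(symbols)
    doubled = symbols + symbols
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    # Last column: the symbol that precedes each sorted rotation.
    last = "".join(symbols[i - 1] for i in order)
    # Primary index: row of the original string, needed for inversion.
    return last, order.index(0)


def run_lengths(bits: str) -> tuple[str, list[int]]:
    # In a binary sequence the runs alternate, so the first bit plus
    # the list of run lengths describes the sequence completely.
    runs = [len(list(group)) for _, group in groupby(bits)]
    return bits[:1], runs


def bitwise_bwt_runs(data: bytes):
    # Bitwise BWT of the input followed by run-length extraction.
    last, primary = naive_bwt(to_bits(data))
    first_bit, runs = run_lengths(last)
    return last, primary, first_bit, runs


if __name__ == "__main__":
    last, primary, first_bit, runs = bitwise_bwt_runs(b"abracadabra")
    print(last)
    print(primary, first_bit, runs)
```

Since the alphabet is just 0 and 1, the decoder only needs the first bit, the run lengths and the primary index to get the transformed bit sequence back; how well the run lengths themselves compress is a separate question.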
In the bitwise case it's possible to transform an input as big as 512 MiB using the conventional
5*N bytes of memory, where N is the size of the input in bytes.
Is it possible to complete the regular bytewise BWT first and then quickly transform its result into the output of the bitwise BWT?