I wrote a brief description of BWTS at
http://mattmahoney.net/dc/dce.html#Section_559 (along with a lot more on BWT and other compression algorithms).
The main difference between BWTS and the regular BWT is that BWTS is bijective, which means every possible string is a valid encoding, so there is no pointer to the first element to store. This is an interesting property, but it improves compression by only 4 bytes per block (the size of that pointer for blocks up to 4 GB). BWTS splits the input into Lyndon words (cycles with an implied first element) and sorts the rotations of each one, which can also have some effect on compression, but experimentally it appears to be negligible.
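
To make the "no pointer" property concrete, here is a minimal Python sketch of the transform and its inverse. It is only an illustration, not how a real compressor would do it: the function names (lyndon_factors, bwts, unbwts) are my own, the factorization uses Duval's algorithm, and the sort uses quadratic memory, so it is only suitable for short strings.

```python
def lyndon_factors(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    n, i, out = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            out.append(s[i:i + j - k])
            i += j - k
    return out


def bwts(s):
    """Forward transform: sort all rotations of all Lyndon factors, keep last chars."""
    n = len(s)
    rots = [w[r:] + w[:r] for w in lyndon_factors(s) for r in range(len(w))]
    # Rotations are compared as infinitely repeated strings; 2*n characters
    # are enough to decide the order of any two of them.
    rots.sort(key=lambda r: (r * (2 * n // len(r) + 1))[:2 * n])
    return "".join(r[-1] for r in rots)


def unbwts(t):
    """Inverse transform: no start index needed, just follow every cycle."""
    n = len(t)
    # lf[i] = row whose rotation starts with the character t[i] (stable sort).
    lf = [0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda i: t[i])):
        lf[i] = rank
    seen, out = [False] * n, []
    for i in range(n):
        j = i
        while not seen[j]:          # each cycle yields one Lyndon word, reversed
            seen[j] = True
            out.append(t[j])
            j = lf[j]
    return "".join(reversed(out))   # words come out last-to-first, so reverse all
```

For example, bwts("banana") factors the input as b, an, an, a and gives "annbaa", and unbwts("annbaa") returns "banana" without any extra index. Any string at all can be passed to unbwts, and transforming the result again gives that same string back, which is what bijective means here.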