Conor, I sent you an email, not sure if you got it. I would rather report this privately.
Very interesting ideas, nice work. I think there are interesting possibilities along these lines.
I'm trying to test performance, but can't really do it properly because the LZSSE2 encoder takes over 24 hours on some small files.
The LZSSE2 Optimal Parser appears to be O(N^3).
You can test it just by running on a file that's all zero bytes. For other stress tests check out
http://www.cbloom.com/rants.html
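One rough way to check the scaling claim is to time the encoder on all-zero inputs of increasing size and fit the slope on a log-log plot. This is only a sketch: `encode` is a hypothetical callable standing in for whatever wraps the real encoder, and the sizes are arbitrary.

```python
import math
import time

def all_zero_input(n):
    """n zero bytes, the degenerate case mentioned above."""
    return bytes(n)

def measure(encode, sizes):
    """Time a hypothetical encode(data) callable on all-zero inputs."""
    times = []
    for n in sizes:
        data = all_zero_input(n)
        start = time.perf_counter()
        encode(data)
        times.append(time.perf_counter() - start)
    return times

def fit_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

A slope near 1 suggests linear behaviour, and a slope near 3 would be consistent with cubic behaviour. Timings on small inputs are noisy, so treat the result as a hint, not a proof.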
There are two standard fixes:
1. When the optimal parser finds a match > 64 bytes (or whatever), just take the match and step ahead the parse position by matchlen. Don't parse at every position inside the match.
2. In the match finder, "amortize" the search; that is, cap the number of tree steps at some maximum (64 or whatever).
(This is "cutValue" in LZMA LzFind.)
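Both fixes can be sketched in a toy forward optimal parse. This is not LZSSE2's parser: it uses a hash chain instead of a binary tree to keep it short, the bit costs are made up, and `LONG_MATCH` and `MAX_STEPS` are assumed values. It expects `bytes` input.

```python
from collections import defaultdict

MIN_MATCH = 3
LONG_MATCH = 64    # fix 1 threshold (assumed value)
MAX_STEPS = 64     # fix 2 cap, same role as cutValue (assumed value)
LIT_COST, MATCH_COST = 9, 24   # made-up bit costs, not LZSSE2's

def find_match(data, pos, chains, max_steps=MAX_STEPS):
    """Longest match at pos, examining at most max_steps candidates."""
    best_len, best_off = 0, 0
    if pos + MIN_MATCH > len(data):
        return best_len, best_off
    candidates = chains[data[pos:pos + MIN_MATCH]]
    for steps, cand in enumerate(reversed(candidates)):  # newest first
        if steps >= max_steps:
            break  # fix 2: stop searching, keep the best found so far
        length = 0
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_off = length, pos - cand
    return best_len, best_off

def insert(data, pos, chains):
    """Record pos under its 3-byte prefix."""
    if pos + MIN_MATCH <= len(data):
        chains[data[pos:pos + MIN_MATCH]].append(pos)

def optimal_parse(data):
    n = len(data)
    cost = [float("inf")] * (n + 1)
    choice = [None] * (n + 1)  # (length, offset) of the step arriving here
    cost[0] = 0
    chains = defaultdict(list)

    def relax(end, new_cost, step):
        if new_cost < cost[end]:
            cost[end], choice[end] = new_cost, step

    pos = 0
    while pos < n:
        length, offset = find_match(data, pos, chains)
        if length >= LONG_MATCH:
            # fix 1: take the long match and jump past it
            relax(pos + length, cost[pos] + MATCH_COST, (length, offset))
            for p in range(pos, pos + length):
                insert(data, p, chains)
            pos += length
            continue
        relax(pos + 1, cost[pos] + LIT_COST, (1, 0))  # offset 0 marks a literal
        for l in range(MIN_MATCH, length + 1):
            relax(pos + l, cost[pos] + MATCH_COST, (l, offset))
        insert(data, pos, chains)
        pos += 1

    # Walk the chosen steps backwards to recover the token sequence.
    tokens, i = [], n
    while i > 0:
        length, offset = choice[i]
        tokens.append(data[i - 1] if offset == 0 else (length, offset))
        i -= length
    return tokens[::-1]

def decode(tokens):
    """Inverse of optimal_parse: an int is a literal, a tuple is (length, offset)."""
    out = bytearray()
    for t in tokens:
        if isinstance(t, int):
            out.append(t)
        else:
            length, offset = t
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)
```

On an all-zero input this emits one literal and one long match, instead of evaluating every position inside the run. A real encoder would also cap the match length, which this sketch leaves out.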
Alternatively, I think you could just use LzFind from LZMA.
It's also a decent compromise to just use a Cache Table Match Finder with an optimal parse. This gives less compression but is a good space-speed tradeoff.
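A cache table match finder can be sketched as a hash table whose buckets hold only the few most recent positions, so each lookup costs a bounded amount of work. The class name, hash function, bucket width and length cap below are all assumptions for illustration, not taken from any particular encoder.

```python
class CacheTableMatchFinder:
    """Hash table with a few recent positions per bucket (hypothetical layout)."""

    def __init__(self, hash_bits=12, ways=4, min_match=3, max_len=255):
        self.hash_bits = hash_bits
        self.ways = ways
        self.min_match = min_match
        self.max_len = max_len
        self.table = [[] for _ in range(1 << hash_bits)]

    def _hash(self, data, pos):
        # Multiplicative hash of the next three bytes.
        k = data[pos] | (data[pos + 1] << 8) | (data[pos + 2] << 16)
        return ((k * 2654435761) & 0xFFFFFFFF) >> (32 - self.hash_bits)

    def find_and_insert(self, data, pos):
        """Return (length, offset) of the best match at pos, then record pos."""
        if pos + self.min_match > len(data):
            return 0, 0
        bucket = self.table[self._hash(data, pos)]
        limit = min(self.max_len, len(data) - pos)
        best_len, best_off = 0, 0
        for cand in bucket:  # at most `ways` candidates, newest first
            length = 0
            while length < limit and data[cand + length] == data[pos + length]:
                length += 1
            if length >= self.min_match and length > best_len:
                best_len, best_off = length, pos - cand
        bucket.insert(0, pos)   # newest entry goes to the front
        del bucket[self.ways:]  # evict the oldest entries
        return best_len, best_off
```

Per-position work is bounded by `ways * max_len` byte comparisons, which is where the speed comes from. The cost is that older candidates are evicted, so some matches are missed.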
Also, may I suggest that any encoder that runs slower than 1 MB/s should print incremental progress reports (% done or something), so we have some idea where it is in long runs.
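A throttled progress reporter is only a few lines. The sketch below is one possible shape, with a hypothetical `Progress` class; the encoder would call `update()` with the number of bytes consumed so far.

```python
import sys
import time

class Progress:
    """Print percent done at most once per min_interval seconds."""

    def __init__(self, total, out=sys.stderr, min_interval=1.0, clock=time.monotonic):
        self.total = total
        self.out = out
        self.min_interval = min_interval
        self.clock = clock
        self.last = clock()

    def update(self, done):
        """Report progress; returns True if a line was written."""
        now = self.clock()
        # Always print the final update, otherwise throttle.
        if done < self.total and now - self.last < self.min_interval:
            return False
        self.last = now
        pct = 100.0 * done / self.total if self.total else 100.0
        self.out.write(f"\r{pct:5.1f}% ({done}/{self.total} bytes)")
        self.out.flush()
        return True
```

Writing to stderr keeps the report out of the compressed output when the encoder writes to stdout, and the throttle keeps the reporting cost negligible even if `update()` is called at every parse position.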
