Given a sequence of bytes S, find a (contiguous) sub-sequence X that maximizes (Len(X)-1)*(Count(X)-1), where Count(X) is the number of times X occurs in S. We can then replace each occurrence of X in S with a shorter symbol.
Is there an algorithm for this with acceptable time and memory usage?
I've read XWRT's code, but it seems to separate words by semantics, so it doesn't look suited to generic data?
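
To make the objective concrete, here is a naive brute-force sketch of what I mean. It is only an illustration of the scoring, not the efficient algorithm I'm asking for. The names `best_substring` and `max_len` are made up for this example, and I'm assuming Count(X) means non-overlapping occurrences (which is what `bytes.count` returns), since overlapping matches can't all be replaced.

```python
def best_substring(data: bytes, max_len: int = 32):
    """Brute-force search for the substring X of `data` that maximizes
    (Len(X) - 1) * (Count(X) - 1).

    Assumptions (not from any library): Count(X) is the number of
    non-overlapping occurrences, and candidates are capped at `max_len`
    bytes to keep the search tractable.
    """
    best_score, best_sub = 0, b""
    seen = set()  # substrings already scored, to avoid recounting
    n = len(data)
    for length in range(2, min(max_len, n) + 1):
        for start in range(n - length + 1):
            sub = data[start:start + length]
            if sub in seen:
                continue
            seen.add(sub)
            count = data.count(sub)  # non-overlapping occurrences
            score = (length - 1) * (count - 1)
            if score > best_score:
                best_score, best_sub = score, sub
    return best_score, best_sub


if __name__ == "__main__":
    print(best_substring(b"abcabcabcxyz"))
```

This rescans the input for every distinct candidate, so it is far too slow for real inputs; it only pins down the quantity being maximized.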