> I agree, this is also my conclusion for a "small
> dictionary" algorithm, such as an LZ implementation with,
> for example, a 64KB sliding window.
Just a few weeks ago I implemented a rep-like filter
based on that design, and the window size there was large enough.
> such as deep ROLZ, or a very large LZ window
> then this method seems no longer convenient.
I don't really see why.
The main drawback of the scheme with a separate window buffer
and input buffer is copying data from input to window.
But that copy won't be slow, since the data is already in L1 cache
after the match comparisons, and there'd be enough other
operations to overlap it with.
On the other hand, the ring buffer would have 3 data intervals,
while the separate window has only 2.
Code:
[slot][slot][processed|new][slot][slot]
\____________________/\___/\__________/
Also, it would never reach its full specified capacity
(I mean, an LZ with a 128M ring buffer would almost never
be able to match a string from exactly 128M back).
But actually these layouts are quite similar, so
I guess it doesn't matter.
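The separate-window copy scheme described above could be sketched roughly like this (the names `Window` and `win_append` are my own illustration, not from any real codec; a real implementation would slide in larger chunks rather than on every append):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WIN_SIZE = 1 << 16 }; /* e.g. a 64KB sliding window */

typedef struct {
    uint8_t buf[WIN_SIZE];
    size_t  len; /* bytes currently in the window */
} Window;

/* Append n processed bytes copied from the input buffer; when the
   window is full, discard the oldest bytes so the newest WIN_SIZE
   bytes stay contiguous. Unlike a ring buffer, the window is always
   one linear interval, so match comparisons never wrap. */
static void win_append(Window *w, const uint8_t *src, size_t n) {
    if (n >= WIN_SIZE) {            /* keep only the tail of src */
        memcpy(w->buf, src + n - WIN_SIZE, WIN_SIZE);
        w->len = WIN_SIZE;
        return;
    }
    if (w->len + n > WIN_SIZE) {    /* slide: drop the oldest bytes */
        size_t keep = WIN_SIZE - n;
        memmove(w->buf, w->buf + w->len - keep, keep);
        w->len = keep;
    }
    memcpy(w->buf + w->len, src, n);
    w->len += n;
}
```

The cost is the extra memcpy per processed block, which is the drawback mentioned above; the gain is that all window offsets stay valid linear positions.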
Btw, I had an interesting idea which I haven't implemented yet.
We can compress the data in the window using some fast bitcode.
That way, the effective window size would increase by 20-50%,
and comparisons might become faster due to fewer memory accesses.
The tricky part is selecting the statistical model, though -
the whole idea is to compare compressed strings directly, and
adding an extra pass to collect statistics usually isn't a good idea.
But I have an implementation of a "universal" model like that.
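A minimal sketch of one such fast static bitcode (the 15-symbol frequent set and all names here are my own illustration, not the "universal" model mentioned above): frequent bytes get a 4-bit code, everything else a 12-bit escape. Because the model is fixed, equal strings always encode to equal nibble sequences, so matching can be done on the compressed window directly. For clarity the output here is one nibble per array element; a real version would pack two nibbles per byte and compare packed bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* 15 common text bytes get 4-bit codes 0..14; nibble 0xF is the escape. */
static const uint8_t FREQ[15] = " etaoinshrdlcum";

static int code_of(uint8_t b) {
    for (int i = 0; i < 15; i++)
        if (FREQ[i] == b) return i;
    return -1; /* not in the frequent set */
}

/* Encode src as a nibble sequence; returns the nibble count.
   Frequent byte -> 1 nibble (its index); rare byte -> escape nibble
   0xF followed by the byte's high and low nibbles (12 bits total). */
static size_t pack(const uint8_t *src, size_t n, uint8_t *dst_nibbles) {
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        int c = code_of(src[i]);
        if (c >= 0) {
            dst_nibbles[k++] = (uint8_t)c;
        } else {
            dst_nibbles[k++] = 0xF;
            dst_nibbles[k++] = src[i] >> 4;
            dst_nibbles[k++] = src[i] & 0xF;
        }
    }
    return k;
}
```

On plain text most bytes hit the 4-bit path, which is where the 20-50% effective window growth would come from; the fixed table is what avoids the statistics-collection pass.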