Whether probably to make PAQ asymmetric?
Whether probably to make PAQ asymmetric?
yes, but not in the direction you want
Since in the compression phase the data is known you can speed that process up via predicting not only the next bit yk, but more future bits yk+1 yk+2, etc. In a single threaded implementation this can speed up compression, since the compiler can generate more efficient code by reordering. Multithreading would be possible, too. That approach is not acceptible for decompression. Assuming data to be known via highly probable bits migh be possible, too.
Actually simple CM coders are still much faster than paq,
so adding asymmetry to paq (ie adding some redundant
information which can be used to speed up the decoding)
easily makes sense.
In paq case, the first target would be disabling the
useless filetype-specific submodels: eg. storing the
probabilities generated by submodels without mixing for a
block of data, then only including the sequences which may
improve compression
(its possible to detect useless sequences without actual
adaptive mixing, like by estimating how much a given
sequence can improve the compression when mixed with
perfect weights)
and encode the submodel usage flags along with
the data block.
And certainly it may be possible to go even further,
as a paq submodel itself usually mixes a few different
predictions, which may or may not be beneficial in
specific case.
And then there's parameter tuning which imho also can
be considered a method for asymmetric compression.
Well, surely there's also a different kind of asymmetry -
where encoding is made faster at the expense of decoding -
one example is ABS coding, another may be BWTS vs BWT.
But although such methods may allow to make decoding
indefinitely slow, still, there won't be any real
benefits for encoding, not for paq-like models anyway,
so it doesn't make much sense except for maybe some
cryptography usage.
Bulat Ziganshin toffer Shelwien thanks!!
I laid down to sleep and was thinking. Somehow my thoughts returned to your post and it led me to a generalization of predictor-corrector model.
The predictor-corrector works like this:
Make a model that, based on the past, predicts future and write down hits and misses. In case of PAQ or other CM it's a metamodel, combining a set of simple ones.
Now how about rewriting it this way:
1. Have a set of models that has a property that regardless of the past, for every possible future character there is a model that predicts it.
2. Write down a stream of statements "Use model X for N symbols".
The simplest case, where each model just predicts a constant symbol, regardless of the past, it's RLE.
It's a generalization, because for every predictor p, you can create a set like:
{p, p+1 mod N, ..., p+N-1 mod N} where N is the size of alphabet.
And this model allows huge asymmetry. Like in CM, run a large set if simple models (and metamodels?). Then choose the one that works perfectly on possibly long string and write it down.
Does this idea make sense?
Sure, that's how most asymmetric coders work anyway.
But specific details are troublesome. For one, "Write down a stream of statements" is adding pure redundancy.
So if we'd try greedily selecting the best submodel for each symbol, we won't get any compression at all in the end.
Btw, the usual CM submodel mixing is a more efficient implementation of the same idea - it doesn't add redundancy,
but is symmetric.
Then, another problem is statistics maintenance.
Sure, it does make sense eg. to determine applicable models in paq for each 64k block, store model selecting flags
with the block, and get faster decoding without unnecessary models. But how you would roll back the model statistics
in encoder?
As I wrote, greedy selection with the simplest set of models is RLE. "A,B,C repeated 4 times,D,..."
As far as I understand CM, it's a special case of what I described above. In case of binary alphabet, writing hit-miss-hit-hit is the same as writing p-!p-p-p. So it can work w/out adding any redundancy too, though it gets symmetric. I think PAQ doesn't write hit-miss directly, but I'm pretty sure it's still equivalent.
About models and statistics - I think it's about the same as in CM. The same or similar models should work. And collecting statistics would work the same. The main difference is the here instead of mixing them in predefined fashion and seeing if it works, you look forward and pick the one (or a mixed subset?) that's locally best.
I'm not really sure if it can be made to work well either. Few models to choose from may mean short guesses. Many models add a lot of redundancy. Making models smarter (like making PAQ one of them) slows down compression and decompression, but improves predictions w/out adding redundancy. Is it possible to weigh the choices, so it works well? I'd like to see.
Now that I'm thinking about it, I messed one thing up. RLE is more like coding then modeling, it seems to me it's better to think about it as choosing a model for each character. So it will not make any compression directly, but if models do well, encoding will be able to exploit it, whether it's something like RLE or statistical stuff.
Parser seems critical.
Last edited by m^2; 24th August 2010 at 02:37.
1. Having more models is (in theory) always better for compression, while the switching is done blockwise, or mixing is used instead of switching.
2. CM works by bayesian inference - instead of coding the model selection X and symbol S,
it computes p(S) from the set of p(Xi)*p(Sj) and avoids coding any meta-information.
3. Again, the problem with statistics is how to restore the models. Normally it looks kinda like this: we have a 512M hashtable which gets updates
after processing each symbol (or bit). Say, we processed a block of data with this model and found out that its better to discard it. How do we return
the 512M hashtable to the state before processing this block? (presuming solid compression)
Here is an idea. Try compressing using many different combinations of models and parameters. Pick the best one and store it in the archive. The decompresser then uses the stored model. The ZPAQ format would support this, but so far no compressor does yet. However other programs have been written with this idea, like enc, epm, and m1x2.