Hi,
What is the best compression method for permutations (1,....,n) when there are only a few elements fullfill the condition e[i] > e[i+1] were 1 <= e[i] <= n?
thanks
Hi,
What is the best compression method for permutations (1,....,n) when there are only a few elements fullfill the condition e[i] > e[i+1] were 1 <= e[i] <= n?
thanks
Well, an obvious answer is enumeration.
But to count these specific permutations, we need a precise description of their structure, not in terms of "only a few elements".
For example, we can start with only one out-of-order element, etc.
For a general permutation you need lg(n!) bits (assuming uniform probability distribution among them), what can be easily get with enumerative coding, e.g. with rANS-style renormalization to avoid large number arithmetics.
https://en.wikipedia.org/wiki/Lehmer_code
However, the condition suggests that you want to encode some special permutations.
Probably the simplest way is going to nonuniform probabilities in this standard enumerative coding way - somehow model them (Pr(0) should increase I think) and then e.g. analogously use rANS.
Ok here some more information of the struchture of the permutations.
- the permutation has just 1 cycle
- the permutation length is 10000 <= n <= 50000
- there are just 255 turning points were e[i-1] > e[i]
with ranking (enumerative coding) i just get (n-1%) bytes for a permutation of length n, but i need to get about 10% savings
If turning points are just ~1%, maybe just encode them - transformation from identity permutation to yours.
If order of such transformations is not important, try to sort them and encode differences.
If I properly understand, e[i] - e[i-1] is:
a) usually 1 or
b) a small natural number, or
c) negative only in 255 cases.
If so, maybe use entropy coder to choose one of these 3 cases, in a) you can additionally use RLE to directly write lengths of such blocks, in b) maybe some entropy coding, and in c) probably directly write the negative distance.
I'd also suggest detecting and encoding "turning points" separately.
For example, you can create a bitmap of locations (n-bit string with 255 1s in it),
then encode e[i] values for bitmap[i]=1, with the bounds of (e[i+1]..n]...
or just one of 255 remaining values, if e[i] are all unique.