# Thread: What is best compression method for permutations

1. ## What is best compression method for permutations

Hi,

What is the best compression method for permutations (1,....,n) when there are only a few elements fullfill the condition e[i] > e[i+1] were 1 <= e[i] <= n?

thanks  Reply With Quote

2. Well, an obvious answer is enumeration.
But to count these specific permutations, we need a precise description of their structure, not in terms of "only a few elements".  Reply With Quote

3. For a general permutation you need lg(n!) bits (assuming uniform probability distribution among them), what can be easily get with enumerative coding, e.g. with rANS-style renormalization to avoid large number arithmetics.
https://en.wikipedia.org/wiki/Lehmer_code

However, the condition suggests that you want to encode some special permutations.
Probably the simplest way is going to nonuniform probabilities in this standard enumerative coding way - somehow model them (Pr(0) should increase I think) and then e.g. analogously use rANS.  Reply With Quote

4. Originally Posted by Shelwien Well, an obvious answer is enumeration.
But to count these specific permutations, we need a precise description of their structure, not in terms of "only a few elements".

- the permutation has just 1 cycle
- the permutation length is 10000 <= n <= 50000
- there are just 255 turning points were e[i-1] > e[i]

with ranking (enumerative coding) i just get (n-1%) bytes for a permutation of length n, but i need to get about 10% savings  Reply With Quote

5. If turning points are just ~1%, maybe just encode them - transformation from identity permutation to yours.
If order of such transformations is not important, try to sort them and encode differences.

If I properly understand, e[i] - e[i-1] is:
a) usually 1 or
b) a small natural number, or
c) negative only in 255 cases.
If so, maybe use entropy coder to choose one of these 3 cases, in a) you can additionally use RLE to directly write lengths of such blocks, in b) maybe some entropy coding, and in c) probably directly write the negative distance.  Reply With Quote

6. I'd also suggest detecting and encoding "turning points" separately.
For example, you can create a bitmap of locations (n-bit string with 255 1s in it),
then encode e[i] values for bitmap[i]=1, with the bounds of (e[i+1]..n]...
or just one of 255 remaining values, if e[i] are all unique.  Reply With Quote

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•