# Thread: What is best for a pure Entropy Encoder?

1. ## What is best for a pure Entropy Encoder?

When people write compression code, they generally have several stages, with the last stage often being an entropy compressor. The question is what the best approach is for this last stage. Do you assume a nonstationary data stream at this point, so that rather than using a pure entropy coder you make up for the mistakes of the models in the preceding stages by tuning the last stage to various files? Or do you make it as pure as possible and tune the preceding stages to give a more stationary stream of data to this final stage of entropy compression? Or does one just take a set of files and try to tune the whole set of passes to work well on that set of data?
I am just curious what other people think. I feel most know my thoughts in this area, so please feel free to discuss your own. Or do most people think the same? Also, do you prefer to work in binary for the last stage, or with some larger alphabet?
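One way to make the stationarity question concrete: measure the order-0 entropy of successive blocks of the stream that reaches the final stage. If it swings widely from block to block, a static ("pure") entropy coder tuned to global statistics will lose ground to an adaptive one. A minimal sketch (illustrative helper names, not from any particular codec):

```python
import math
from collections import Counter

def order0_entropy(block: bytes) -> float:
    """Shannon order-0 entropy of a block, in bits per symbol."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def block_entropies(data: bytes, block_size: int = 4096):
    """Entropy of each block; wide variation hints at a nonstationary stream."""
    return [order0_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

# A stream whose statistics switch partway through:
# 0.0 bits/symbol for the constant prefix, near 8 for the uniform tail.
stream = b"a" * 2000 + bytes(range(256)) * 8
print(block_entropies(stream, 2000))
```

If the per-block figures are all close to each other, tuning the earlier stages has done its job and the last stage can stay simple; if not, the last stage has modeling work left to do.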

2. I think your question aims at something different, but for me, compression is always modeling + coding. Coding should be done with Arithmetic Coding or something similar.

3. I've never understood what "entropy encoding" is, other than encoding a pure-entropy file so it's smaller (which it cannot be, so that means I get to call it whatever I want when I get around to writing it). However, Wikipedia claims "In information theory an entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium." ... and goes on to state that Huffman & Arithmetic are entropy encodings.

I'm assuming that by this definition it's any encoding method that does not use modelling? Please feel free to correct me on this.
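For what it's worth, one of the two coders Wikipedia names fits that reading well: a static Huffman coder assigns whole-bit code lengths from raw symbol frequencies, with no modeling beyond counting. A hedged sketch (my own helper names) that computes only the code lengths, which are what determine the compressed size:

```python
import heapq
from collections import Counter

def huffman_code_lengths(data: bytes) -> dict:
    """Code length in bits per symbol for a static Huffman code
    built from frequency counts -- counting, not modeling."""
    freqs = Counter(data)
    if len(freqs) == 1:                      # degenerate: a single symbol
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, tiebreak, {symbol: depth_so_far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)      # merge the two lightest trees;
        w2, _, d2 = heapq.heappop(heap)      # every contained symbol gets deeper
        merged = {s: d + 1 for s, d in (*d1.items(), *d2.items())}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

lengths = huffman_code_lengths(b"aaaabbc")
# The most frequent symbol gets the shortest code.
assert lengths[ord("a")] < lengths[ord("c")]
```

Nothing here looks at symbol order or context, which is roughly what "independent of the specific characteristics of the medium" seems to mean.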

4. Well, I think an entropy coder is given the probabilities of the symbols in the alphabet at any stage of the compression process and must produce an unambiguous code word from them. It does not have any memory, besides some state independent of the input alphabet (i.e. low, range, compressed output, etc.).
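That interface can be made concrete with a toy exact arithmetic coder: the coder's only state is the interval (low, width), and the probability table is handed to it from outside at every step. This is a sketch using exact rationals to sidestep precision issues, not a practical implementation:

```python
from fractions import Fraction

def encode(symbols, probs):
    """Narrow [low, low+width) by the probability of each coded symbol.
    The coder itself is memoryless: probs come from outside."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        cum = sum(probs[:s], Fraction(0))   # cumulative probability below s
        low += cum * width
        width *= probs[s]
    return low, width  # any number in [low, low+width) identifies the input

def decode(code, n, probs):
    """Mirror the encoder's interval narrowing to recover n symbols."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        target = (code - low) / width       # position inside current interval
        cum = Fraction(0)
        for s, p in enumerate(probs):
            if cum + p > target:
                break
            cum += p
        out.append(s)
        low += cum * width
        width *= p
    return out

probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
msg = [0, 1, 0, 2, 0]
low, width = encode(msg, probs)
assert decode(low, len(msg), probs) == msg
```

A real range coder adds renormalization and carry handling so the interval fits in machine words, but the state really is just (low, range) plus the output stream, as the post says.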

5. Well, psrc has a secondary model in it, but I believe that it's still an entropy coder.
As I see it, the model outputs some quantitative estimation of the amount of information in the data symbols (entropy = -information; so it's frequently used as a synonym, to avoid tautology). Then we need an algorithm to efficiently transform that information into a number. But computers have limited precision, so we need an approximation which would be close to http://en.wikipedia.org/wiki/Entropy...rmation_theory) for the whole input.
I guess the algorithm that approximates the Shannon entropy for a given model's output is called an entropy coder.
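Reading it that way, the target is easy to state: given the probability the model assigned to each symbol actually coded, the Shannon cost is the sum of -log2(p) bits, and an ideal arithmetic coder matches it to within a couple of bits, since the final interval has width prod(p). A quick numeric check (illustrative probabilities only):

```python
import math

# Model output: the probability assigned to each symbol as it was coded.
probs_of_coded_symbols = [0.5, 0.25, 0.5, 0.25, 0.125]

# Shannon's ideal code length is -log2(p) bits per symbol.
ideal_bits = sum(-math.log2(p) for p in probs_of_coded_symbols)

# An exact arithmetic coder shrinks its interval by a factor of p per
# symbol, so naming a number inside the final interval of width prod(p)
# takes about -log2(width) bits, plus a small constant.
width = math.prod(probs_of_coded_symbols)
coded_bits = -math.log2(width)

assert abs(ideal_bits - coded_bits) < 1e-9
```

The "approximation" in the post is exactly the gap a finite-precision coder opens up against this ideal figure.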
