# Thread: Data compression explained

1. ## Data compression explained

My free online book on data compression.
http://mattmahoney.net/dc/dce.html

I may add some more stuff later. For now it covers most of the basic algorithms like LZ77, BWT, PPM, CM, Huffman, arithmetic coding, JPEG, MPEG, audio, etc, plus some theory. Most of the books on data compression are out of date, so someone needed to write this. Comments welcome.

2. I took a quick look, but it's only in the last few years that people have used 14 files for the Calgary Corpus. In the early years people compressed all 18 files, and everyone was trying to get the smallest total length. Now they only talk about 14 files and don't care about the compressed length.
Instead they measure bits per byte for the selected files and take the average of those 14 values. So to me it's very confusing. I know on this forum it's the 14 files. However, when I compare using BWTS with Mark's, I am going to stick to the 18.

But overall, good job!
David Scott

3. 5.3.2.5. Results

There's something broken with the table.
In particular, it has 6 columns but only 5 headers.

5.7.2

If precomp can't recompress some stream perfectly, but has something close, it generates patches.

4. I've just had a quick look at your text and really appreciate it. I'll just drop a few quick thoughts, since I'm actually at work. Maybe you already have some of this.

1 mixing via 2d SSE, which is used in some bitwise CM compressors

1a with quantised probabilities P(y=1|Q(p0),Q(p1)) (CCM)
http://groups.google.com/group/encod...d17dbd63707f41

1b with quantised bit histories P(y=1|Q(s0),Q(s1)) (M1)

2 linear mixing in probability space, i.e. w0*p0 + w1*p1 + ...

2a two-input mixers (1-w)*p0 + w*p1; several update rules exist, e.g. counter backpropagation, numeric iterative optimization, etc., all with or without some approximations. You may want to have a look within IRC.
There actually exist a lot of implementations.

2b N-ary inputs. I once used it in conjunction with a simple gradient update.

3 Non-flat alphabet decomposition (order-0 Huffman is used in some CTW, order-1 in M1)

4 CTW seems to be missing

5 When talking about parameter tuning you may want to mention the direct application of stochastic search algorithms like GAs, which is superior to "ad-hoc" tuning (BEE, Shelwien's coders, M1).

6 I have some serious refinements to match models (e.g. suffix contexts), which I can publish after making further experiments/implementations.

5. Preformatted text sections (enclosed in <pre> HTML tags) are not displayed correctly in my Firefox.
You may remove the "<font face=arial,sans-serif>" tag or close all "<p>" elements and it will display correctly.

6. Thanks for the comments. I fixed the formatting in Firefox. I guess I should not be so lazy and leave out closing </p> tags. It had looked OK in IE and Chrome.
I also added some updates based on everyone's comments and added an acknowledgement section. I still plan to add sections for more algorithms like CTW, byte pair encoding, numeric codes (unary, rice, mu-law, floating point, etc), JPEG compression, maybe something about noisy channels, archive formats, etc.

7. ## fantastic!

It is absolutely fantastic!

It of course comes at just the right time and at just the right level for my exploration of compression.

It's a lot easier to read if you've tried to pick up the principles first; if I had come across this book first, I think I'd have needed some diagrams of trees for PPM and such.

And now I want SEE and SSE covered to the same depth as arithmetic encoding was! But that's just to suit my personal learning agenda.

I think that HTTP on-the-fly compression with DEFLATE is very common, and often forgotten; presumably gazillions of DEFLATEd bytes are going through the internet every day; probably outnumbering the transfer of archived bytes on the internet.

(Tiny correction: the "I" in "I frame" stands for intra-frame, not inter-frame, in video.)

(H.264 intra-frames are typically half the size of the equivalent JPEG. They delta encode blocks within themselves, which is one of the things that helps achieve this feat.)

I'm not convinced byte pair encoding needs much mention.

8. Here's a version for printing (compact): http://nishi.dreamhosters.com/u/dce2010-02-26.pdf
It can also be used for precomp benchmarks.

9. Just fabulous. A shame that I can only learn from it rather than help improve it.

10. I'll add my thanks for this. As someone who is fascinated with compression, I'll have to find the time to read it.

11. 1. Mistypes

"DMC (dynamic Markov coding) was and described in a paper"
"after disarding"

2. Bytewise coding

It's described as one huge mistake ("A fourth problem is..."),
which is plain wrong.

a) Bitwise models can't precisely estimate the probabilities
of bytewise data (like ASCII text):
- There are hardware (CPU) precision limits which affect the
results if we try to compute bitwise probabilities from
the precise bytewise distribution
(taking practical speed considerations into account).
- The same applies to updating the bitwise probabilities in a
bytewise fashion (it's slow and imprecise).
- Most bitwise modelling components are not "portable" -
they can't be directly applied to bytewise modelling
because of various technical problems.
- Thus most practical bitwise models produce noticeably
different results with different permutations of the byte
alphabet
(e.g. see the reorder results in http://compressionratings.com/bwt.html#transformation).
Unfortunately that's not really an opportunity for compression
improvement - more like partial compensation for modelling
imprecision.

Again, there's just no way to make a precise bitwise model
for a bytewise source within current CPU architecture limits.

But, unfortunately, people only see that bitwise models are
simpler and show better overall results (meaning, mostly for
data which is not bytewise, like executables).

b) Arithmetic coding of non-binary alphabets is an important
technique, both for speed optimization and algorithm
simplicity.
For example, it allows encoding a random value from a given
range with a single operation (involving divisions, but
that's still faster than multiple rc calls with renormalization
loops).
Well, it's reasonable that it's never used in the paq framework,
because it's completely incompatible (for example, paq's
carryless rangecoder can't be used for bytewise coding
because of underflow issues), but why is it implied that no
other approaches are possible?

c) Non-binary alphabet coding (including bytewise) is
basically a speed optimization method. It can be explained
in terms of using two types of rangecoder operations -
combining intervals without renormalization (computing the
alphabet intervals), and then coding them with
renormalization (actual rangecoding).
Of course, it's not a universal method, but if something
isn't compatible with the paq framework, that doesn't mean
it's not a useful compression technique.
For example, not long ago we saw a bytewise order-0
rangecoding demo by Cyan with ~100MB/s processing speed,
while bitwise coders can demonstrate ~30MB/s at most.

3. ppmd/ppmonstr

> binary context - in the highest order context only one value
> has appeared (one or more times).

"value"? I'd say byte/symbol.
Also its called "binary" because it virtually contains two
symbols - the one encountered in this context and escape.

> nm-context - two or more values have appeared, and none of
> them have appeared (are masked) in a higher order context.

Actually it's just a context of maximum order with more than
one different symbol seen in it.
That is, a context doesn't have to be free of masked symbols
to be processed as "nm".

> -r1 partially rebuilds the model after disarding it when
> memory is used up, which improves compression.

Its not "partially rebuilt".
Instead, ppmd keeps rescaling the counts in the tree
(while removing the symbols for which freq becomes zero,
and their subtrees)
until the tree size becomes smaller than some threshold
(25% afair).

> ppmonstr uses an even more complex SEE context, and
> additionally uses interpolation to smooth some of the
> quantized contexts.

I think ppmonstr only uses SSE.
Also I don't remember any such interpolation there -
although it updates some adjacent entries in the SSE table(s),
which might provide a similar effect.

> It also adjusts the prediction for the most probable byte
> using secondary symbol estimation (SSE).

Not only "most probable", but all symbols afaik.

12. ## Matt's updates

(newly added CTW section is removed from diff)

Code:
```diff -rw DCE_2010-02-26/dce.html DCE_2010-02-27/dce.html
17c17
< </font></p><p><font face="sans-serif">Last update: Feb. 26, 2010.
---
> </font></p><p><font face="sans-serif">Last update: Feb. 27, 2010.

512,513c512,520
< a score that combines size and speed, similar to the Maximum Compression
< MFC test, but with minimum speed requirements. The benchmark includes a calculator
---
> a score that combines size and speed. The data includes English text, Windows executable
> code, RGB and grayscale images, CD quality audio, and a mix of data types from
> two video games. None of the test data is compressed (JPEG, PDF, ZIP, etc).
> Each of the 10 data sets contains multiple files. Single file
> compressors are tested on an equivalent
> <a href="http://en.wikipedia.org/wiki/Tar_%28file_format%29">tar</a> file.
> Programs must pass a qualifying round
> with minimum compression ratio and time requirements on a small data set.
> The benchmark includes a calculator

516,517c523,529
< The top ranked programs for the default settings as of Jan. 2010 are nanozip followed by
< freearc, CCM, flashzip, and 7-zip. Runsas is the author of nanozip.
---
> The default scale is "1/10 lower ratio or twice the total time is worth half
> the rating". This is effectively the same formula used by maximumcompression.com.
> Compressed sizes include the size of the (not compressed)
> Windows executable decompression program.
> As of Feb. 2010, 427 qualifying combinations of program versions and options were tested.
> The top ranked programs for the default settings were nanozip followed by
> freearc, CCM, flashzip, and 7-zip. Runsas is also the author of nanozip.
535a548,552
> </font></p><p><font face="sans-serif"><a href="http://metacompressor.com/">Metacompressor</a> is an automated
> benchmark that allows developers to submit programs and test files
> and compare size (excluding the decompresser), compression time, and
> decompression time.
>

840c857,858
< </font></p><p><font face="sans-serif">A fourth problem is that bytewise arithmetic coding is inefficient.
---
> </font></p><p><font face="sans-serif">A fourth problem is that straightforward
> bytewise arithmetic coding is inefficient.

845c863
< up to 256 values. This problem is solved by encoding one bit at a time
---
> up to 256 values. One solution is to encode one bit at a time

1043c1061
< </font></p><h3><font face="sans-serif">4.2. Variable Order Models (DMC, PPM)</font></h3>
---
> </font></p><h3><font face="sans-serif">4.2. Variable Order Models (DMC, PPM, CTW)</font></h3>

1057c1075
< and described in a paper by Gordon Cormack and Nigel Horspool in 1987
---
> described in a paper by Gordon Cormack and Nigel Horspool in 1987

1199c1217
< <li><font face="sans-serif">binary context - in the highest order context only one value has appeared
---
> <li><font face="sans-serif">binary context - in the highest order context only one byte value has appeared

1201,1202c1219
< </font></li><li><font face="sans-serif">nm-context - two or more values have appeared, and none of them have
< appeared (are masked) in a higher order context.
---
> </font></li><li><font face="sans-serif">nm-context - two or more values have appeared in the highest order context.

1248,1249c1265,1275
< from scratch. Optionally, the tree may be partially rebuilt before
< modeling resumes.
---
> from scratch. Optionally, the the statistics associated with each context are scaled
> down and those with zero counts are pruned until the tree size becomes
> smaller than some threshold (25%). This improves compression but takes longer.
>
> </font></p><p><font face="sans-serif">Although bytewise arithmetic encoding can be inefficient, ppmd is in
> practice faster than many equivalent bitwise context mixing models.
> First, a byte is encoded as a sequence of escape symbols followed by
> a non-escape. Each of these encodings is from a smaller alphabet.
> Second, the alphabet within each context can be ordered so that the
> most likely symbols are first. This reduces the number of operations
> in the majority of cases.

1254,1255c1280
< -r1 partially rebuilds the model after disarding it when memory is
< used up, which improves compression.
---
> -r1 says to prune the context tree rather than discard it.

1259c1284,1373
< </font><h3><font face="sans-serif">4.3. Context Mixing</font></h3>
---
> </font><h4><font face="sans-serif">4.2.3. CTW</font></h4>
[...]
> </font></p><h3><font face="sans-serif">4.3. Context Mixing</font></h3>

1263,1267c1377,1383
< algorithms predict one bit at a time (like DMC) except that
< there are multiple models making independent predictions which
< are then combined by weighed averaging. Often the result is
< that the combined prediction is better than any of the individual
< predictions that contribute to it.
---
> algorithms predict one bit at a time (like CTW) except that
> weights are associated with
> models rather than contexts, and the contexts need not be mixed from
> longest to shortest context order. Contexts can be arbitrary functions of the history,
> not just suffixes of different lengths. Often the result is
> that the combined prediction of independent models compresses better than
> any of the individuals that contributed to it.

1269c1385
< </font></p><p><font face="sans-serif">PPM and DMC are based on the premise that the longest context
---
> </font></p><p><font face="sans-serif">DMC, PPM, and CTW are based on the premise that the longest context

3280,3282c3396,3397
< </font></p><p><font face="sans-serif">MPEG-4/AVC (H.264) differs from MPEG-1/2 mainly in that it uses
< a <a href="http://en.wikipedia.org/wiki/Discrete_wavelet_transform">wavelet</a> transform
< instead of a discrete cosine transform (DCT), and supports arithmetic
---
> </font></p><p><font face="sans-serif">MPEG-4/AVC (H.264) differs from MPEG-1/2 in that it
> supports arithmetic
3287a3403
> It also uses a deblocking filter.```

13. I noticed that some images were in BMP format, so I converted them to PNG.
http://www.sendspace.com/file/xyxs97

14. Yeah, I left them as BMP because PNG wasn't much smaller.

Anyway, I was going to post about the updates and the new section on CTW. Thanks for the suggestions.

Bitwise and bytewise context models are equivalent because by the chain rule, P(x0..x7) = P(x0)P(x1|x0)P(x2|x0x1)...P(x7|x0...x6). You will get differences if your bit probabilities are not accurate. Alphabet reordering might also affect memory usage. I know about the trick in lpaq9 of swapping bytes 32 and 31, but such differences are small. For BWT it is another matter. It does context sorting based on alphabet order so the order matters. You get better text compression if you group letters, digits, whitespace, etc. because they make similar predictions.

I have to look at Shkarin's paper again. It is based on ppmd/ppmonstr var. H. I don't know about later versions. I am not sure how you do SSE on more than a binary alphabet prediction.

15. > Bitwise and bytewise context models are equivalent because
> by the chain rule,
> P(x0..x7) = P(x0)P(x1|x0)P(x2|x0x1)...P(x7|x0...x6).
> You will get differences if your bit probabilities are not accurate.

Unfortunately it's not that simple.
Sure, with plain occurrence counts it doesn't matter,
but with linear counters (p'=p*(1-wr)+y*wr), and even more so with
state machines, there's visible redundancy in bitwise coding:

For example:
434564 // book1 result of a tuned order0 bytewise model
434873 // book1 result of a tuned order0 bitwise model

(Note that both coders were tuned specifically to book1)

This difference may seem insignificant, but it gets wider
with more orders mixed in.
But usually the bitwise coder would still be chosen in this
case, because it shows much better results for binary files.
However, it's a fact that there are cases where a bytewise
model has better precision.

> Alphabet reordering might also affect memory usage.
> I know about the trick in lpaq9 of swapping bytes 32 and 31,
> but such differences are small.

202928 // reorder forward.xlt book1 | paq8px68 -7
202120 // reorder backward.xlt book1 | paq8px68 -7

Is that considered small? There would likely be at least a
2k difference if we bothered to actually tune the permutation.

> I have to look at Shkarin's paper again. It is based on
> ppmd/ppmonstr var. H. I don't know about later versions.

Btw, winrar uses ppmd vH as well, not vI.
Also, that paper doesn't really explain in detail how ppmonstr works.
But I have a decompiled version of vI.

> I am not sure how you do SSE on more than a binary
> alphabet prediction.

The most straightforward way would be to do an n-ary mapping,
probability distribution to probability distribution.
But actual implementations are kind of bitwise in that sense
anyway (e.g. http://ctxmodel.net/files/ASH/ash04.rar) -
they just use unary coding based on adaptive symbol ranking.
Still, there's no problem with using bitwise probability
estimation and bytewise coding.

16. Originally Posted by Shelwien
202928 // reorder forward.xlt book1 | paq8px68 -7
202120 // reorder backward.xlt book1 | paq8px68 -7
What do those transforms do?
paq8px68 has a lot of language modeling that gets messed up by alphabet reordering.

BTW I added a little about BWT suffix array sorting. I still need to do some more research on this topic though.

17. It's this - http://ctxmodel.net/files/BWT_reorder_v2.rar
Please test it yourself then, as all of your public compressors have
"a lot of language modeling", and demonstrating it with my coders
wouldn't be much of a proof.

18. http://mattmahoney.net/dc/dce.html

Added section 3.4. numeric codes (unary, Rice, Golomb, extra bit).
3.5. archive formats, checksums, encryption.

19. So I guess Shelwien is right that alphabet order does matter. I did an experiment where I transformed book1 by multiplying all bytes by 3 (reversible by multiplying by 171), then compressed with various algorithms that are supposedly not optimized for English text. This transform hurts everything except LZW, it seems.

Code:
```x3         original
-------  -------
768,771 768,771 BOOK1
566,269 566,253 flzp (byte aligned LZP)
443,111 442,501 fpaq0p (adaptive order 0)
439,562 432,845 bpe 5000 4096 200 3 (byte pair encoding)
435,363 435,227 fpaq0 (stationary order 0)
431,509 424,522 fpaq0f2 (fast adapting indirect order 0)
335,033 335,033 compress (LZW)
313,761 313,597 zip (LZ77)
312,410 312,391 ha (PPMC order 5)
278,293 276,364 sr2 (symbol ranking)
264,226 264,144 cab -m lzx:21 (LZX)
263,476 261,064 7zip (LZMA)
251,791 251,499 dmc c 100000000  (DMC)
233,666 232,598 bzip2 -9 (BWT+MTF)
222,115 219,434 p6 (neural network CM)
216,412 216,021 ppmd (PPM order 4)
215,162 213,162 bbb (BWT+CM)
211,321 209,514 ctw (context tree weighting)
210,671 208,691 zpaq cmid.cfg  (ICM-ISSE chain, MATCH, MIX)
206,162 203,247 ppmonstr (PPM order 12)```
ppmd and ppmonstr have some language modeling for English because one of the SEE contexts is whether the first 2 bits of the last byte are 00, which distinguishes letters from spaces and punctuation. None of the other algorithms use language modeling. zpaq uses more memory for x3 because context hash tables are indexed on nibble boundaries, but hash tables are only 5% full vs 4% so this should not have much effect.

Aside from ppmd and ppmonstr, the alphabet effect seems to be large on adaptive bitwise contexts like fpaq0p, fpaq0f2, sr2 (for its order 1 model), bwt (sorting is bitwise lexicographical), ctw (uses 8 bit counters), p6, and zpaq cmid.cfg (indirect models) and smaller on stationary bitwise models like fpaq0 and dmc. But there is still a small effect on bytewise models that is hard to explain.

20. > But there is still a small effect on bytewise models that is hard to explain.

Likely because of hash tables, or maybe they're not completely bytewise.
Also, MTF is initialized with a specific byte value ranking (bzip2),
and zip/cabarc could use some static Huffman tables somewhere.
PPMs (ha) and DMC could be affected by their initialization
of low orders.

Some more results:

Code:
```x3     original
------ -------
434877 434877 // sh_v1m e book1 | http://compression.ru/sh/coders6c2.rar
214846 214846 // green c book1 | http://ctxmodel.net/files/green/green_v3a.rar
206808 206808 // ppmy book1 | http://compression.ru/sh/ppmy_3c.rar
205417 205417 // ash04a /s0 book1 | http://compression.ru/sh/ash04a.rar
204318 203294 // ash04a book1```
Note that the difference only shows up if we let Ash use SSE.

Reorder utility for x3:

Code:
```#include <stdio.h>
#include <stdlib.h>

// reorder <coef> <input> <output>
// multiplies every byte by <coef>; putc() keeps only the low 8 bits,
// so this is multiplication mod 256
int main( int argc, char** argv ) {
  if( argc<4 ) return 1;
  FILE* f = fopen( argv[2], "rb" ); if( f==0 ) return 2;
  FILE* g = fopen( argv[3], "wb" ); if( g==0 ) return 3;
  int h = atoi(argv[1]);
  int c;
  while( (c=getc(f)) != -1 ) putc( c*h, g );
  fclose(f); fclose(g);
  return 0;
}```

21. Bitwise modelling with flat decomposition effectively models guesses on character ranges, e.g. bit 7 distinguishes whether the next character is within the range 0..127 or 128..255, and so on.

A multi-symbol counter, like P(x=a) = S(a)/T, where S(a) is the occurrence count of symbol a and T the total count of the current context, only modifies S(a) and T per update step, but not S(x) for x != a. In bitwise models, however, all symbol frequencies get changed, since P(x=a) is expressed as a product of bitwise decisions. Hence the probabilities of other symbols can change more drastically than just S(x)/(T+1) (since S(x) effectively gets modified, too).

The ASCII encoding groups characters with similar meaning (e.g. lower case letters, upper case letters, digits, ...). Still, the ASCII mapping is just a more or less sensible permutation. Modifying the character codes modifies the decomposition, i.e. the mentioned grouping is likely to be distorted.

In addition, bitwise models contain some redundancy, since some branches in that character guessing are deterministic (e.g. sometimes bit 7 is always 0). That relation can be changed by modifying the permutation, too. Huffman decomposition removes that redundancy (there are no "internal" nodes with just a single leaf).

22. Yep, that's how I created book1x3.

Here are some more results using

reorder forward.xlt book1 book1f
reorder backward.xlt book1 book1b

forward.xlt helps on average but not every time. backward.xlt not so much. Compression options are same as above. Biggest gain is DMC.

Code:
``` 768,771 BOOK1
261,064 book1.7z
213,162 book1.bbb
432,845 book1.bpe
232,598 book1.bz2
264,144 book1.cab
209,514 book1.ctw
251,499 book1.dmc
566,253 book1.flzp
435,227 book1.fpaq0
424,522 book1.fpaq0f2
442,501 book1.fpaq0p
312,391 BOOK1.HA
219,434 book1.p6
216,021 BOOK1.pmd
203,247 BOOK1.pmm
276,364 book1.sr2
335,033 BOOK1.Z
312,502 book1.zip
208,691 book1.zpaq
6,585,783 bytes

768,771 book1f
260,835 book1f.7z
211,629 book1f.bbb
434,801 book1f.bpe
231,416 book1f.bz2
264,121 book1f.cab
209,847 book1f.ctw
237,658 book1f.dmc
571,461 book1f.flzp
435,306 book1f.fpaq0
422,012 book1f.fpaq0f2
443,452 book1f.fpaq0p
312,397 BOOK1F.HA
219,199 book1f.p6
216,026 book1f.pmd
203,222 book1f.pmm
280,404 book1f.sr2
335,033 BOOK1F.Z
313,664 book1f.zip
208,747 book1f.zpaq
6,580,001 bytes

768,771 book1b
262,884 book1b.7z
215,428 book1b.bbb
435,203 book1b.bpe
233,853 book1b.bz2
264,159 book1b.cab
211,483 book1b.ctw
240,622 book1b.dmc
570,290 book1b.flzp
435,321 book1b.fpaq0
430,147 book1b.fpaq0f2
442,851 book1b.fpaq0p
312,405 BOOK1B.HA
220,616 book1b.p6
216,143 book1b.pmd
204,430 book1b.pmm
280,310 book1b.sr2
335,033 BOOK1B.Z
313,624 book1b.zip
210,294 book1b.zpaq
6,603,867 bytes```
However it breaks the language model in paq8px_v67 -6.

Code:
```191,702 book1.paq8px
201,507 book1b.paq8px
200,062 book1f.paq8px```

23. That "forward.xlt" permutation is tuned for BWTmix enwik6 compression,
and "backward.xlt" is a inverse permutation.
So for programs which gain something from forward.xlt there's a
high probability of gaining a few more kbs by real tuning.
Unfortunately, realtime optimization of such alphabet permutation
doesn't seem possible for modern bitwise CMs.

24. http://mattmahoney.net/dc/dce.html

Added 6.1.6. JPEG recompression (Stuffit and PAQ).

Minor rewrites of 6.1.5 (added inverse DCT and color transforms) and 1.3 (better explanation of algorithmic probability).

25. Thanks for Jan's JPEG model description - it's really hard to understand the code.
Also, there's this: http://www.winzip.com/wz_jpg_comp.pdf

26. Thanks. I will have to add that one too. It has some nice ideas that could be used in the PAQ model. I'm surprised they published such details.

27. > Thanks. I will have to add that one too.

What about this one then?
http://www.elektronik.htw-aalen.de/p.../packjpg_m.htm

> I'm surprised they published such details.

They advertise it (zipx) as an "open format",
and I guess posting that spec was more convenient than
posting the sources.
http://www.winzip.com/comp_info.htm

28. Matt, can you add a table of contents?

29. Originally Posted by Matt Mahoney
However it breaks the language model in paq8px_v67 -6.

Code:
```191,702 book1.paq8px
201,507 book1b.paq8px
200,062 book1f.paq8px```
Here is the result for paq8px_v67 with modified wordmodel, distancemodel, and nestmodel to handle the reordered alphabet.

paq8px_v67 -6: 191,702 book1.paq8px
paq8px_v67_reorder -6: 191,689 book1.paq8px

30. Yeah, I'll have to add all that too. Meanwhile here is a description of the WinZIP JPEG algorithm (section 6.1.6.3)

http://mattmahoney.net/dc/dce.html

I had to figure out the coder from the (expired) patent. I can see some places where the JPEG algorithm could be improved. The coder was designed for speed in 1987, when table lookup was faster than multiplication. The coder also tightly integrates the context model, so flexibility is lost: you can't do context mixing or change the adaptation rate. There is also some loss due to coarse quantization of the probabilities and update rates. The coder was tuned on binary and grayscale images, which is probably not optimal for a lot of other data. I had done some experiments earlier on PAQ where I replaced the ICM-like context models now used with more stationary (CM-like) models and got better JPEG compression. The IBM coder seems to adapt too quickly.

There are a few other oddities, like the avg() function not being symmetric with respect to the two neighboring blocks, or the coding of mantissa bits not using the previous bits as context. However, there are a lot of interesting ideas that could be used in PAQ, like using low-frequency coefficients in neighboring blocks to predict the current block.

