
Thread: PPMX 0.06 has been released!

  1. #1
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts

PPMX 0.06 has been released!

It's late night (or already morning) here, so I guess it's time to release something. Well, working on PPM is a timeless process. Since the last release, PPMX has been rewritten from scratch many times - I tested many ideas and applied many improvements and optimizations. I found that PPMX 0.05 was simply wrong, and that I had to write a properly done PPM-based compressor, no matter whether it turns out to be the greatest or not. So, here is PPMX 0.06. It's quite a baseline order-4 PPM: no SEE, no special tricks or gimmicks. Even so, with this PPM engine I have started something essentially new - at the very least, something new in the PPM area. I just can't accept that PPMd is the only PPM everywhere. Compared directly to PPMd (-o4), PPMX has quite different properties:
• It uses hash tables, so there is no need to rebuild/flush a tree (see the sketch below)
• It is much more stable on files that are "bad" for PPMd - audio, already-compressed data and some binary files
• PPMX has no SEE
• It has a more aggressive model update (i.e. a more non-stationary model)
• The model set is order-4-2-1-0-(-1), i.e. PPMX has no order-3 model
• Memory usage is fixed, at about 70 MB
That said, PPMX is a much more complex project than, say, BCM or even PIMPLE. Although PPMX.CPP is as small as 8 KB, it is very tricky to remove all the redundant computations and make it faster. I spent far more time on PPMX than on the aforementioned BCM. It's more interesting, more complex and a genuinely non-trivial task. So, anyway, enjoy the new release!
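To illustrate the hash-table point, here is a minimal sketch of the general idea - the names, constants and layout are made up for illustration, not taken from PPMX:

Code:
    // Sketch: hashed order-4 context lookup with fixed memory.
    // All names/constants are hypothetical, not PPMX's real scheme.
    #include <cstdint>
    #include <cstring>

    struct Context {
        uint32_t check;       // tag to detect hash collisions
        uint8_t  count[256];  // per-symbol counts for this context
    };

    const int TABLE_BITS = 18;                 // 2^18 slots * 260 bytes ~ 68 MB
    static Context table[1 << TABLE_BITS];

    Context* lookup_order4(uint32_t last4) {   // last4 = previous 4 bytes packed
        uint32_t h = last4 * 2654435761u;      // multiplicative hash
        Context* cx = &table[h >> (32 - TABLE_BITS)];
        if (cx->check != last4) {              // empty slot or collision:
            cx->check = last4;                 // simply take over the slot -
            memset(cx->count, 0, 256);         // no tree rebuild/flush, ever
        }
        return cx;
    }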

EDIT: Uploaded PPMX 0.06P4 - a highly optimized build that uses SSE2 vectorization. Note that it requires a Pentium 4 or newer processor.
Attached Files

  2. #2
    Member
    Join Date
    May 2009
    Location
    China
    Posts
    36
    Thanks
    0
    Thanked 0 Times in 0 Posts
Thanks, testing it now.

  3. #3
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Ilia!

    Mirror: http://lovepimple.110mb.com/

  4. #4
    Tester

    Join Date
    May 2008
    Location
    St-Petersburg, Russia
    Posts
    182
    Thanks
    3
    Thanked 0 Times in 0 Posts
small tests:

Code:
    # /  original size / ppmx 0.06 (freearc) / 7z-ppmd:mem70m_o32
    1 /         815438 /              620934 /             600147
    2 /       45596394 /            20618832 /           18380193
    3 /       53134726 /            13077216 /           12071403
    4 /       51853567 /            24721968 /           23749705
    5 /       10379612 /             4583081 /            4471278
    6 /      590413759 /           259526096 /          241291465

  5. #5
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts

If you compare an order-4 PPM to an order-32 PPMII, don't forget to consider the compression time! Otherwise, you might as well add PAQ8 results...

  6. #6
Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts

  7. #7
Matt Mahoney (Expert)
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts

  8. #8
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Thanks a lot!

  9. #9
    Member
    Join Date
    May 2008
    Location
    CN
    Posts
    45
    Thanks
    0
    Thanked 0 Times in 0 Posts
D:\renkoudmp\1306>time 0<cr
Current time: 16:04:17.57
Enter the new time:

D:\renkoudmp\1306>ppmx c hu1306.dmp hu1306.dmp.px
PPMX 0.06P4 Fast PPM compressor
Copyright (C) 2010 Ilia Muraviev

Compressing...
100%

393103360 -> 26968237 in 23.8 sec

D:\renkoudmp\1306>time 0<cr
Current time: 16:04:41.64

while:

D:\renkoudmp\1306>time 0<cr
Current time: 15:51:27.97
Enter the new time:

D:\renkoudmp\1306>pigz -6 hu1306.dmp

D:\renkoudmp\1306>time 0<cr
Current time: 15:51:30.00
Enter the new time:

  10. #10
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Did a complete rewrite of PPMX again. The new PPMX is fully compatible with PPMX 0.06 (same compression), but performance is notably better:

    Core2Quad, ENWIK8:

    PPMX 0.06 (IA32) -> 21 sec
    PPMX 0.06P4 (SSE2) -> 18 sec
    PPMX 0.07 (IA32) -> 16 sec

That said, the current PPMX 0.07 (IA32) can still be slightly slower on binaries than the vectorized 0.06P4 (SSE2). Work continues...

  11. #11
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Recency scaling... A nice idea that actually works, even on text files.

During encoding, we temporarily increase the count of the most recent character (the one last seen in this context) by some amount or factor, say:

count[r0] += count[r0]>>2;

and after encoding, we restore the original count:

count[r0] = original_count;

Just collecting and testing some ideas for PPMd-like counters...
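As I read it, a minimal self-contained sketch could look like this (hypothetical names and structure, not PPMX's actual code; note that the running total presumably has to be boosted by the same amount):

Code:
    // Recency scaling, sketched: temporarily boost the count of the
    // most recently seen symbol (rank 0) while coding, then restore it,
    // so the stored model itself is left unchanged.
    struct Ctx { int count[256]; int total; int rank0; };

    void encode_with_recency_scaling(Ctx* ctx, int sym) {
        int r0    = ctx->rank0;            // symbol last seen in this context
        int orig  = ctx->count[r0];
        int boost = orig >> 2;             // +25%

        ctx->count[r0] += boost;           // skew the distribution...
        ctx->total     += boost;           // ...keeping the total consistent

        encode_symbol(ctx, sym);           // assumed coder primitive

        ctx->count[r0] = orig;             // restore the original count
        ctx->total    -= boost;
        ctx->rank0     = sym;              // sym is now the most recent symbol
    }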

  12. #12
Matt Mahoney (Expert)
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    Interesting idea. Bit histories in paq and zpaq (ICM and ISSE) effectively do this by making the last bit part of the state in addition to the two bit counts.
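For illustration only - this is not paq's or zpaq's actual state table, just the shape of the idea in a minimal sketch:

Code:
    // A bit-history state in the spirit of ICM/ISSE: two bounded bit
    // counts plus the most recent bit as part of the state itself.
    struct BitHistory {
        unsigned n0   : 4;   // bounded count of 0-bits
        unsigned n1   : 4;   // bounded count of 1-bits
        unsigned last : 1;   // the most recent bit - the "recency" part
    };

    void update(BitHistory& h, int bit) {
        if (bit) { if (h.n1 < 15) h.n1++; if (h.n0 > 2) h.n0--; }
        else     { if (h.n0 < 15) h.n0++; if (h.n1 > 2) h.n1--; }
        h.last = bit;   // same counts, different last bit => different state
    }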

  13. #13
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Unfortunately, it looks like we have far fewer options with byte counts. At the same time, PPM improvements don't end with SEE - it's good to use SSE as well. For example, if you compare simple SSE on the last symbol seen (rank 0) against SEE, the simple SSE can give a higher compression gain for the same complexity and speed impact. Probably, in the past, too many authors were focused on PPM's SEE, or didn't know much about SSE. Anyway, with PPMX I'm currently building a very solid foundation. Any tricks can be added in the future, but for now I want to keep it fast. Still searching for ideas on how to make a simple count-based counter more advanced:

count[c] = table[count[c]][tot]; // something like a state machine, probably

For now, the only idea that really works is that recency/deterministic scaling.
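For reference, here is a minimal sketch of what SSE on the rank-0 symbol could look like - the quantization, table size and adaptation rate are all invented for illustration:

Code:
    // SSE sketch: refine the primary estimate p(rank0 symbol) through a
    // small adaptive table indexed by (context order, quantized p).
    static int  sse[8][32];     // 12-bit probabilities, orders assumed 0..7
    static bool inited;

    int sse_apply(int order, int p) {          // p in 0..4095
        if (!inited) {                         // start from an identity mapping
            for (int o = 0; o < 8; o++)
                for (int i = 0; i < 32; i++) sse[o][i] = i * 128 + 64;
            inited = true;
        }
        return sse[order][p >> 7];             // (real code would interpolate)
    }

    void sse_update(int order, int p, int rank0_hit) {
        int& s = sse[order][p >> 7];
        s += ((rank0_hit ? 4095 : 0) - s) >> 5; // nudge toward the outcome
    }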

Also, Dmitry Shkarin makes use of a trick in the SEE of contexts with unmasked symbols - an ad-hoc (or not) and tricky escape count update:

    // Tabulated escapes for exponential symbol distribution
    static const BYTE ExpEscape[16]={51,43,18,12,11,9,8,7,6,5,4,3,3,2,2,2};

    // ...

    pc->SummFreq=p->Freq+(ns > 1)+ExpEscape[QTable[BSumm >> 8]];

  14. #14
Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
> It's good to use SSE as well.
> For example, if you compare simple SSE on the last symbol seen (rank 0) against SEE,

rank0 is not necessarily the last symbol... e.g. it can also be the symbol with
the highest freq.

> the simple SSE can give a higher compression gain for the same
> complexity and speed impact.

But escape modelling in PPM is still important.
I mean, I don't think that plain escape modelling plus added SSE
for the rank0 symbol would give you better results than SEE.

> Probably, in the past, too many authors were focused on PPM's SEE,
> or didn't know much about SSE.

I think only a few people ever tried to implement a PPM at all.
And as for SSE, there are too many open questions about it, PPM or not.

    > Still searching for ideas on how to make a simple count-based
    > counter more advanced.

It's the same thing as with SSE, actually - basically you can't.
(It's not completely impossible, it just requires too many multiplications
and divisions, plus too much memory for precise counter values
(like floats), and it would still be easy to lose precision somewhere.)

Well, plain counts make SSE easy - they can be adaptively reordered and masked
(at the cost of a division per counter to compute the probability, if there's SSE) -
but independent counters are better, and there's too much trouble with both
speed and precision if you try to reorder them.

So I think there's no choice - we just have to use a fixed decomposition
(ideally something like a huffman tree built from order1 stats).
At least, I recently found a way to use it with a tree instead of hashtables
(I mean http://encode.su/threads/1173-New-da...for-bitwise-CM).
And it seems it also solves the masking problem, as escapes can be
coded at the leaves (not really escapes, but selection of unused branches),
and we can continue with that subtree at the lower order, so no masking
would be necessary.
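To make the decomposition idea concrete, a minimal sketch with a plain fixed 8-bit decomposition (a huffman tree built from order1 stats would only reshape the tree; encode_bit() and the table size here are assumed):

Code:
    // Code a byte as 8 binary decisions down a fixed tree, with one
    // adaptive probability per visited node. No per-symbol division,
    // and "masking" turns into skipping unused branches.
    #include <cstdint>

    void encode_bit(int bit, uint16_t* prob);  // assumed coder primitive

    static uint16_t p[1 << 16][256];           // [hashed context][tree node]

    void encode_byte(unsigned cx, int sym) {   // cx assumed < 65536
        int node = 1;                          // root of the implicit tree
        for (int i = 7; i >= 0; i--) {
            int bit = (sym >> i) & 1;
            encode_bit(bit, &p[cx][node]);     // code with the node's estimate
            int pr = p[cx][node];              // adapt the 12-bit probability
            pr += ((bit << 12) - pr) >> 5;
            p[cx][node] = (uint16_t)pr;
            node = node * 2 + bit;             // descend
        }
    }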

> count[c] = table[count[c]][tot]; // something like a state machine, probably

Sure, with that you can get better results than with increments, but:
1. Even a 16-32k table can be surprisingly slow, as the L1 data cache
is still only 32k on core2. i7 moved the previous L2 to L3 and
added a 256k L2, but that's still not much.
2. I'm not aware of any theory which could help you compute such a table.
I guess it's possible to do what I did with that fsm counter here - initialize
the table with a simple increment rule and run an optimizer on it.
But that would be slow with such a large table, and overfitting is very likely...
3. Even if you managed to find a way to update a single freq like that,
it would leave all the other freqs unchanged (i.e. the probabilities of all
other symbols would change uniformly).
I mean, ideally you'd also modify the other counts, but that only adds
to the trouble.

But as to PPM contexts and state machines, I think it could be interesting
to merge all the counts into one state.
I mean, the context state (the symbol freqs in ppmd, or a single value) is
a lossy representation of the context history.
So it should be possible to explicitly build a lossy compression scheme
for the rank history (keeping symbol codes aside, for the tree), fit the
history code into e.g. 24 bits or less, and look up the next state as usual.
That should be very compact and still possibly better than freqtables
(as it won't discard the order of symbols in the history).
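A tiny sketch of one way this could look (the quantization and widths are invented here):

Code:
    // Context state as a lossy code of recent rank history: the last 12
    // ranks, quantized to 2 bits each, packed into a 24-bit state index.
    #include <cstdint>

    uint32_t push_rank(uint32_t state, int rank) {
        int q = (rank == 0) ? 0 : (rank == 1) ? 1 : (rank <= 3) ? 2 : 3;
        return ((state << 2) | q) & 0xFFFFFF;  // keep 12 * 2 = 24 bits
    }
    // probabilities / next-state info would then be table[state] lookups,
    // and the order of recent symbols is preserved, unlike with freq tables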

> For now, the only idea that really works is that recency/deterministic scaling.

Actually, it's a very simple case of delayed update.
To improve it, you can store one (you already have that with the mtf ranking)
or more recent symbols in each context, and do context-aware probability
estimation and updates with them.
As a side effect, it might also work as a speed optimization, because you won't
have to update new contexts.

    > static const BYTE ExpEscape[16]={51,43,18,12,11,9,8,7,6,5,4,3,3,2,2,2};
    > pc->SummFreq=p->Freq+(ns > 1)+ExpEscape[QTable[BSumm >> 8]];

BSumm is the SEE probability of no-escape (i.e. of coding the symbol in a
binary context), taken from the just-processed binary context (it's stored in
processBinSymbol()). So: a lower escape freq for a lower SEE escape probability.

  15. #15
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
The tricky part is that, if you look carefully at that temporary count increment, it's the same as a permanent decrement of the last count. In other words, recency scaling is a kind of continuous rescaling - except that we rescale (or decrement) not all symbols in a context, but just the last one seen.
It's close to PAQ1-like counters, where we decrease the opposite bit count. Here we are just decrementing the previous character seen in the context! Nice!
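A quick hypothetical example: with counts {A:4, B:2}, boosting the recent A by 25% (+1) at coding time gives p(A) = 5/7 ≈ 0.71 instead of 4/6 ≈ 0.67, while permanently decrementing B by 1 gives p(A) = 4/5 = 0.80 - both skew the coding distribution toward the recent symbol without rescaling the whole context.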

  16. #16
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
I ran some tests with RKUC v1.04 to see what PPM with sparse ("binary", in Malcolm Taylor's terms) contexts can do.

It's a big question how to add sparse contexts to PPM. Unlike CM, with PPM we code each symbol using just one context. Judging by the results, RKUC most likely has no SSE, which could potentially make use of sparse contexts. Also, we can see something going on with order-1 and especially order-0 - extra contexts are used there - so with the "-b" switch, order-0 is not a true order-0. Probably, Malcolm Taylor added to / replaced order-1 with some small sparse contexts... or even mixed sparse contexts into order-0/1... I will test some of these assumptions with PPMX...

  17. #17
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    287
    Thanks
    9
    Thanked 33 Times in 21 Posts
Afaik, RKive, PPMZ2 and co. use special Local Order Estimation (LOE) techniques:
based on some cost function or heuristic, you skip some contexts instead of escaping from them.

The most obvious way is to rank contexts by their most probable symbol (see the sketch below).
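A rough sketch of that kind of heuristic (not RKive's or PPMZ2's actual cost function; the MPS ratio is just the most obvious confidence measure):

Code:
    // LOE sketch: pick the starting context by confidence instead of
    // always starting at the highest order and escaping downwards.
    struct Ctx { int max_count, total; };      // hypothetical context stats

    Ctx* pick_context(Ctx* byOrder[], int maxOrder) {
        Ctx* best = byOrder[0];                // order-0 always exists
        int bestMps = -1;
        for (int o = maxOrder; o >= 0; o--) {
            Ctx* c = byOrder[o];
            if (!c || c->total == 0) continue; // context not seen yet
            int mps = (c->max_count << 12) / c->total; // MPS probability
            if (mps > bestMps) { bestMps = mps; best = c; }
        }
        return best;                           // code the symbol from here
    }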

  18. #18
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
    Code:
    Standalone RKIVE v2.00 universal compressor (v1.04).
    Copyright (c) 1999 Malcolm Taylor.
    
    RKUC c|e [args] source [destination]
    
    Args:
            -mXXX   Model size in megabytes.
            -oXX    Model order (0-16).
            -x      Use LOE.
            -b      Use binary contexts.
As you can see, LOE is disabled by default.

  19. #19
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Some testing results with PPMX:

ppmx006 - the baseline
ppmx006s1 - with a sparse context (x.x)
ppmx006s2 - with a sparse context (x..x)
ppmx006s3 - with a sparse context (xx..)

All versions are exactly the same, except that the order-1 model is replaced with the various sparse models (one possible reading of the masks is sketched below).
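For clarity, one plausible reading of those masks (b1 = the most recent byte; hash2() is a hypothetical two-byte context hash, and this may not match PPMX's actual layout):

Code:
    // Sparse contexts, sketched: hash non-adjacent history bytes
    // instead of a contiguous order-2 context.
    #include <cstdint>

    inline uint32_t hash2(unsigned a, unsigned b) {
        return ((a << 8) | b) * 2654435761u;   // 2-byte context hash
    }

    // b1 = most recent byte, b4 = three bytes earlier:
    void sparse_hashes(unsigned b1, unsigned b2, unsigned b3, unsigned b4,
                       uint32_t& s1, uint32_t& s2, uint32_t& s3) {
        (void)b2;             // b2 is exactly what these contexts skip
        s1 = hash2(b3, b1);   // x.x  : skip b2
        s2 = hash2(b4, b1);   // x..x : skip b2 and b3
        s3 = hash2(b4, b3);   // xx.. : only lagged bytes, skip b1 and b2
    }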

    test.exe
    ppmx006 -> 894828
    ppmx006s1 -> 884834
    ppmx006s2 -> 891880
    ppmx006s3 -> 884347 [!]

    calgary.tar
    ppmx006 -> 816877
    ppmx006s1 -> 815323
    ppmx006s2 -> 813619
    ppmx006s3 -> 815907

    reaktor.exe
    ppmx006 -> 2474332
    ppmx006s1 -> 2451736
    ppmx006s2 -> 2470299
    ppmx006s3 -> 2434944

    photoshop.exe
    ppmx006 -> 7496977
ppmx006s1 -> 7479555
    ppmx006s2 -> 7439805
    ppmx006s3 -> 7374416 [!]

    acrord32.exe
    ppmx006 -> 1620570
    ppmx006s1 -> 1588094
    ppmx006s2 -> 1588853
    ppmx006s3 -> 1573459 [!]

    mso97.dll
    ppmx006 -> 1990065
    ppmx006s1 -> 1950584
    ppmx006s2 -> 1962239
    ppmx006s3 -> 1963458

    pak0.pak
    ppmx006 -> 94131081
    ppmx006s1 -> 92351099
    ppmx006s2 -> 92821704
    ppmx006s3 -> 92533142

  20. #20
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Also tested TraktorDJStudio3.exe with the sparse contexts.

  21. #21
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Did some experiments with an order-5 model; I really dislike the speed impact:

    ENWIK8:

    order-4 -> 25,970,xxx in 16.9 sec
    order-5 -> 24,173,xxx in 24.3 sec

    ppmx006 -> 26,131,726 in 20.9 sec
    ppmx005 -> 22,905,422 in 66.4 sec [!]

A huge speed impact. The results for calgary.tar are:

    order-4 -> 816,xxx
    order-5 -> 794,xxx

    world95.txt:

    order-4 -> 618,xxx
    order-5 -> 548,xxx

    3200.txt:

    order-4 -> 4,179,xxx
    order-5 -> 3,949,xxx

    fp.log:

    order-4 -> 765,xxx
    order-5 -> 695,xxx

    i.e. it's a new compression level...

  22. #22
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts

Continuing to experiment with model orders and data structures.

world95.txt and different model sets:

ppmx v0.05 -> 478,220 bytes (68 sec and 22,905,422 bytes on enwik8)
ppmx v0.06 -> 618,903 bytes (21 sec and 26,131,726 bytes on enwik8)

order-4-2-1-0 -> 618,866 bytes (17 sec and 25,980,665 bytes on enwik8)

order-5-3-2-1-0 -> 547,381 bytes (24 sec and 23,980,471 bytes on enwik8)
order-6-4-2-1-0 -> 515,661 bytes (32 sec and 23,186,823 bytes on enwik8)
order-7-4-2-1-0 -> 500,102 bytes (37 sec and 22,995,582 bytes on enwik8)
order-8-4-2-1-0 -> 494,055 bytes (42 sec and 23,121,832 bytes on enwik8)

order-7-5-3-2-1-0 -> 500,368 bytes (41 sec and 22,616,507 bytes on enwik8)
order-8-5-3-2-1-0 -> 491,018 bytes (46 sec and 22,432,311 bytes on enwik8)
order-9-5-3-2-1-0 -> 486,643 bytes (50 sec and 22,436,610 bytes on enwik8)

Which version would you prefer?

  23. #23
Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
lzma's result for enwik8 is 24,557,177 - for a PPM, it doesn't make sense to do worse than that.

  24. #24
encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,984
    Thanks
    377
    Thanked 352 Times in 140 Posts
Stuck in the middle with the new PPMX. I've done an extreme number of experiments, mainly on byte-oriented counter improvements. I can't find anything really simple that will work. Still choosing between order-5 and order-6, and thinking about a fast and simple (or not) SEE...

  25. #25
Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
> Can't find anything really simple that will work

That's a really faulty assumption - there's no simple function that
magically turns an average codec into a good one.
All known good codecs are fairly complicated, with hardcoded
handlers for many special cases etc. In fact, most of them
have multiple layers of compression (ppmd = symbol ranking + CM,
lzma = LZ77 + CM, bsc = BWT/ST + qlfc + CM, etc). Only ccm is basically
single-stage (there are filters though), so it's possible to
improve its speed by adding huffman decomposition etc - but
it still has special tweaks for a few types of data in the main loop.

And then, we don't actually need such a simple function either, even
if it existed - because it would be easy to reverse-engineer and
understand.

  26. #26
m^2 (Member)
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Shelwien View Post
And then, we don't actually need such a simple function either, even
if it existed - because it would be easy to reverse-engineer and
understand.
Maybe that's exactly why we need it: because it's good for education. Note that encode has released some of his code for free, and even when he didn't, he shared quite a few pieces of knowledge.

    Not everybody fears reversing.

  27. #27
Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
    > Maybe that's why we need it. Because it's good for education.

I meant that there's no benefit for the original author, not even fame.
    Example: do you know who invented hashtable?
    http://en.wikipedia.org/wiki/Hash_table#History

Also, that complexity is what makes data compression so interesting -
it's possible to incrementally add parts to a codec and precisely measure
how each affects the results. Somewhat like a game.
Now it's normal for a complex solution to be better than a simple one
(eg. ppmd vs ppmx), but the appearance of such a "magic function" would
mean "game over".

    > Please note that encode released some code free and even in cases
    > when he didn't he shared quite a few pieces of knowledge.

Still, it's a fact that he tries to make some products to sell,
and I see nothing wrong with that.

    > Not everybody fears reversing.

That only applies when we know we didn't make anything exceptional.
And if you accidentally did, you should always be aware of cases like this:
http://groups.google.com/group/comp....ffc48cc17cfb0f
So having a complex app which is not based on a single idea would make your
life much easier.

  28. #28
m^2 (Member)
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Shelwien View Post
    > Maybe that's why we need it. Because it's good for education.

I meant that there's no benefit for the original author, not even fame.
    Example: do you know who invented hashtable?
    http://en.wikipedia.org/wiki/Hash_table#History
Fame and money are just two of the many things that keep people going.
Both are important, but they're not the only ones. And sharing knowledge doesn't go against either of them.

    Quote Originally Posted by Shelwien View Post
Also, that complexity is what makes data compression so interesting -
it's possible to incrementally add parts to a codec and precisely measure
how each affects the results. Somewhat like a game.
Now it's normal for a complex solution to be better than a simple one
(eg. ppmd vs ppmx), but the appearance of such a "magic function" would
mean "game over".
    Mhm. You see it as engineering as opposed to a spark of genius. I concur.

    Quote Originally Posted by Shelwien View Post
Still, it's a fact that he tries to make some products to sell,
and I see nothing wrong with that.
    Me neither.

    Quote Originally Posted by Shelwien View Post
> Not everybody fears reversing.

That only applies when we know we didn't make anything exceptional.
And if you accidentally did, you should always be aware of cases like this:
http://groups.google.com/group/comp....ffc48cc17cfb0f
So having a complex app which is not based on a single idea would make your
life much easier.
Actually, the argument goes both ways: if you state something publicly, you can defend your authorship more easily, especially if it catches on.

