
Thread: Advanced counter/predictor

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts

    Question Advanced counter/predictor

    Let us consider the following code:
    Code:
    class TPredictor {
    private:
      unsigned short pr;
    
    public:
      TPredictor(): pr(2048) {}
    
      int P() {
        return (pr);
      }
    
      void Update(int y) {
        if (y)
          pr+=((4095-pr)>>5);
        else
          pr-=(pr>>5);
      }
    };
    This is a FPAQ0P-styled predictor that predicts the probability of a one bit.
    And here is a modification a la FPAQ0M:
    Code:
    class TPredictor {
    private:
      unsigned short p1;
      unsigned short p2;
    
    public:
      TPredictor(): p1(2048), p2(2048) {}

      int P() {
        return ((p1+p2)>>1);
      }
    
      void Update(int y) {
        if (y) {
          p1+=((4095-p1)>>3);
          p2+=((4095-p2)>>6);
        }
        else {
          p1-=(p1>>3);
          p2-=(p2>>6);
        }
      }
    };
    For comparison, here is a classical example of stationary/non-stationary (semi-stationary) predictor:
    Code:
    class TPredictor {
    private:
      unsigned char n0;
      unsigned char n1;
    
    public:
      TPredictor(): n0(0), n1(0) {}
    
      int P() {
        return (((n1+1)<<12)/((n0+n1)+2));
      }
    
      void UpdateStationary(int y) {
        if (y) {
          if (++n1>=255) {
            n0>>=1;
            n1>>=1;
          }
        }
        else {
          if (++n0>=255) {
            n0>>=1;
            n1>>=1;
          }
        }
      }
    
      void UpdateNonStationary(int y) {
        if (y) {
          if (n1<255) n1++;
          if (n0>2) n0=(n0>>1)+1;
        } 
        else {
          if (n0<255) n0++;
          if (n1>2) n1=(n1>>1)+1;
        }
      }
    };
    The question is: how can we add SSE/weighted mixing inside this class, i.e. with no additional tables and no additional classes, all the machinery inside one class? The second example needs four bytes per bit prediction; I think we may use eight bytes, i.e. we may add one more 32-bit integer or two 16-bit integers, etc. What we possibly can do:
    + Add weighted mixing of both probabilities (fast and slow ones)
    + Add weighted mixing of one or two probabilities with a static model (prob. = 0.5)
    + Add a kind of SSE/APM inside
    Any ideas/suggestions?
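    For instance, the first bullet could be sketched like this - a fast and a slow counter mixed by an adaptive weight, all inside one class (the weight-update rule below is only an illustration I made up, not taken from any FPAQ variant):

```cpp
// Sketch: two shift-style counters (fast/slow) mixed by an adaptive
// weight w on the same 12-bit scale. The weight moves toward whichever
// counter had the smaller squared prediction error on the last bit.
class TMixPredictor {
  unsigned short p1, p2;   // fast and slow probabilities (0..4095)
  unsigned short w;        // mixing weight for p1 (0..4095)
public:
  TMixPredictor() : p1(2048), p2(2048), w(2048) {}

  int P() const {
    // weighted average: (w*p1 + (4096-w)*p2) / 4096
    return (w * p1 + (4096 - w) * p2) >> 12;
  }

  void Update(int y) {
    int err1 = (y << 12) - p1;
    int err2 = (y << 12) - p2;
    if (err1 * err1 < err2 * err2) w += (4095 - w) >> 5;  // trust p1 more
    else                           w -= w >> 5;           // trust p2 more
    if (y) { p1 += (4095 - p1) >> 3; p2 += (4095 - p2) >> 6; }
    else   { p1 -= p1 >> 3;          p2 -= p2 >> 6; }
  }
};
```

    This stays within the eight-byte budget: two 16-bit counters plus a 16-bit weight.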

  2. #2
    The Founder encode's Avatar
    Maybe it's a good idea to use an approach similar to FPAQ0F2:
    Code:
        int n=t[cxt]&255, p=t[cxt]>>14;  // count, prediction
        if (n<limit) ++t[cxt];
        t[cxt]+=((y<<18)-p)*dt[n]&0xffffff00;
    In other words:
    Code:
    pr+=((y<<N)-pr)*wt; // or *rate
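    For reference, the shift update from the first example is just this form with rate = 1/32, up to a one-step truncation difference (a quick fixed-point check; both functions below are my own illustration):

```cpp
// The shift update pr += (4095-pr)>>5 / pr -= pr>>5 is the linear update
// pr += ((y<<12)-pr)*rate with rate = 1/32, modulo integer truncation.
int update_shift(int pr, int y) {
  return y ? pr + ((4095 - pr) >> 5) : pr - (pr >> 5);
}

int update_linear(int pr, int y) {
  // rate = 1/32 expressed as an arithmetic shift of the signed error
  return pr + (((y << 12) - pr) >> 5);
}
```

    The two differ by at most one probability step for any pr in 0..4095, which is why the general form is the natural place to plug in a variable rate like dt[n].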

  3. #3
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,239
    Thanks
    192
    Thanked 968 Times in 501 Posts
    1. Read the "Parameter optimization" thread... there're examples of
    some advanced counters.
    2. SSE encapsulation isn't a good idea, because SSE is really useful
    only with good contexts.
    3. A counter is just an independent model for the context history.
    So you can split some example file into bit sequences, make
    a good model for compressing those, and then optimize it until
    it fits your restrictions on size and speed.

  4. #4
    The Founder encode's Avatar


    OK, thanks!

    I'm just interested in what's going on inside FPAQ0F2 - why does this so-called "bit history" help so crazily on enwik9, even beating FPAQ0MW?

    I tested FPAQ0F2 with an order-1 coder - nothing happens, compression is even worse compared to FPAQ0M. What is a "bit history"? If I'm correct, we keep a count of how many times this context has been seen, increasing this value by one each time until some defined "limit" is reached. This count affects the update speed - initially the update is the most aggressive, and once the counter has reached the limit the update is the slowest. If this is a correct explanation of what's inside, then by itself it may NOT be called an order-1 coder; however, why does this technique not work if we apply it to an order-1 coder?

    I noticed that, generally speaking, a FPAQ0P-like model works poorly at order-0, but more than correctly at pure order-1. Maybe FPAQ0F2 just fixes some behavior at order-0 but completely fails to work with order-1?


  5. #5
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 151 Times in 18 Posts
    Quote Originally Posted by Encode
    i.e. no additional tables, no additional classes
    FPAQ0F is using additional tables - the bit history.

    Quote Originally Posted by Shelwien
    SSE incapsulation isn't a good idea, because SSE is really useful
    only with good contexts.
    Exactly. It does not make much sense to use SSE inside your everyday counter - you should use it to e.g. 'stabilize' the output of multiple counters or sub-models.

    Quote Originally Posted by Encode
    This count affects on update speed - looks like initially the update is the most aggressive - if counter reached the limit the update is slowest. If this is a correct explanation what's inside
    The explanation lacks an important detail. FPAQ0F's counter uses the bit history as an additional context. Knowing this, it's easy to explain its better performance. Although FPAQ0F is 'order-0', its performance and memory requirements are closer to those of an order-1 coder (because of the bit history).

    Quote Originally Posted by Encode
    Maybe FPAQ0F2 just fixes some behavior at order-0 but completely not works with order-1?
    It's obvious that at order-1 FPAQ0F shouldn't work that well - the bit histories take too long to 'warm up'.

  6. #6
    The Founder encode's Avatar
    Quote Originally Posted by Christian View Post
    FPAQ0F
    Important note: FPAQ0F is NOT FPAQ0F2 - check both sources...

  7. #7
    The Founder encode's Avatar
    Got it:
    Code:
      int p() {
        return sm.p(cxt<<8|state[cxt]);
      }
      void update(int y) {
        sm.update(y, 90);
        int& st=state[cxt];
        (st+=st+y)&=255;
        if ((cxt+=cxt+y) >= 256)
          cxt=0;
      }
    Cheating indeed... While I thought that this variable update rate was what helped...

  8. #8
    Programmer
    Important note: FPAQ0F is NOT FPAQ0F2 - check both sources...
    Of course it's not the same - but it's very similar. You may as well replace FPAQ0F with FPAQ0F2 in my previous text.

    Additionally, you might want to check FPAQ0F2's sources for this line...

    Code:
    U32 *t;       // cxt -> prediction in high 24 bits, count in low 8 bits

  9. #9
    The Founder encode's Avatar
    Quote Originally Posted by Christian View Post
    Additionally, you might want to check FPAQ0F2's sources for this line...

    Code:
    U32 *t;       // cxt -> prediction in high 24 bits, count in low 8 bits
    Nothing special - we just pack our probability and the context-seen count into one 32-bit integer. Nothing about bit histories or tables...

  10. #10
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Variable update speed really doesn't help at all (the gain is tiny, especially with lower order models). If you want to benefit from a longer bit history, use a delayed counter, as Shelwien suggested, two or three bits should be enough. Somewhere within the forum archive, there was a discussion about this. And optimized prediction/update parameters help, too, when compared against something like p += p>>8 or p>>5.
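    A delayed counter could be sketched like this (my own reading of the suggestion: the last two bits seen under the context select one of four plain counters, so short bit patterns get their own probabilities):

```cpp
// Delayed counter sketch: a 2-bit local bit history selects one of
// four shift-style 12-bit counters.
class TDelayedCounter {
  unsigned short pr[4];   // one counter per 2-bit history value
  unsigned h;             // last two bits seen under this context
public:
  TDelayedCounter() : h(0) { for (int i = 0; i < 4; ++i) pr[i] = 2048; }

  int P() const { return pr[h]; }

  void Update(int y) {
    unsigned short &p = pr[h];
    if (y) p += (4095 - p) >> 5;
    else   p -= p >> 5;
    h = ((h << 1) | y) & 3;   // shift the new bit into the history
  }
};
```

    On a strictly alternating bit sequence this quickly learns to predict both phases, which a single counter cannot do.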

  11. #11
    Programmer
    Nothing special - we just pack our probability and context seen count into one 32-bit integer. Nothing about bit histories or tables...
    Please, just look at the sources more carefully. You can start with the param of function StateMap::p(...) with which table t is addressed. :)

  12. #12
    The Founder encode's Avatar
    Code:
    // Initialize assuming low 8 bits of context is a bit history.
    StateMap::StateMap(int n): N(n), cxt(0) {
       alloc(t, N);
      for (int i=0; i<N; ++i) {
        // Count 1 bits to determine initial probability.
        U32 n=(i&1)*2+(i&2)+(i>>2&1)+(i>>3&1)+(i>>4&1)+(i>>5&1)+(i>>6&1)+(i>>7&1)+3;
        t[i]=n<<28|6;
      }
      if (dt[0]==0)
        for (int i=0; i<256; ++i)
          dt[i]=32768/(i+i+3);
    }
    
    // ...
    
    Predictor::Predictor(): cxt(0), sm(0x10000) {
      for (int i=0; i<0x100; ++i)
        state[i]=0x66;
    }

  13. #13
    Programmer
    The predictor is called like this:

    Code:
    sm.p(cxt<<8|state[cxt])
    The param is the context from which the probability is retrieved. Its low byte (state[cxt]) is the bit history and the higher bits are its 'order-0-context' (IOW, the already coded bits of the current symbol). Hope this helps.
    Last edited by Christian; 18th May 2008 at 17:08.

  14. #14
    The Founder encode's Avatar
    Quote Originally Posted by toffer View Post
    If you want to benefit from a longer bit history, use a delayed counter, as Shelwien suggested, two or three bits should be enough.
    Checked the archive. Delayed counters seemed to be an interesting subject...

  15. #15
    Administrator Shelwien's Avatar
    1. bit history is a sequence of bits encountered with a given context
    2. fpaq0f2 is pretty much an order1 coder, just with another context instead of previous byte
    3. fpaq0f2 _properly_ switched to order1 should work better than a common order1 coder,
    though I'm not sure whether it'd reach order2 level. Probably would.
    4. It's not a matter of "warming up". When the mapping is initialized the right way, a coder
    would have _at_least_ the same compression as without mapping.
    Basically, you make a working order1 coder first, then add a _static_ mapping and ensure
    that compression doesn't significantly change (so mapping is properly initialized),
    then you slowly increase the mapping update rate.
    5. Variable update rate does help. But mainly in higher order contexts.

  16. #16
    The Founder encode's Avatar
    OK, thanks for info!

  17. #17
    Programmer toffer's Avatar
    If you want a more cache-efficient implementation of some "narrowed" bit history (as I said in some post, increasing beyond 2 bits doesn't improve compression that drastically), here's a nibble model (sized 64 bytes):

    Code:
    // ctx4 - nibble context (like paq)
    
    class model4
    {
    protected:
      static const uint N=15*4;
      uint32 states;
      bitmodel bitTree[N];
    
    public:
    
      void Clear();
    
      inline int P1( const uint ctx4 )
      { return bitTree[ctx4*4+((states>>2*ctx4)&3)].P1(); } // & binds looser than +, so parenthesize the 2-bit state
    
      inline void Update( const int bit, const uint ctx4 )
      {
        Assert( ctx4<15 );
        const uint mask = 3<<2*ctx4;
        uint state = (states&mask)>>2*ctx4;
        Assert( 4*ctx4+state<N );
        bitTree[4*ctx4+state].Update(bit, C, D);
        state = (2*state+bit)&3;
        states = (states&~mask)|state<<2*ctx4;
      }
    };
    
    void model4::Clear()
    {
      states = 0;
      for (uint i=0; i<N; ++i) {
        bitTree[i].Set();
      }
    }
    With some clustering this might be practical for order1. I haven't checked it out.

    EDIT: Adding a class which holds the state and a pointer to the active bit might increase the performance a bit.
    Last edited by toffer; 19th May 2008 at 13:26.

  18. #18
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi encode,
    Maybe I can give an idea about counters. I'm working on light-weight context mixing (nothing special, just mixing orders 0-5 with a neural network and 2-dimensional SSE in limited memory - e.g. 70 MB, no additional sub-models). I have two well-used bit models:
    1. pr += (4095-pr)>>5 style
    2. Semi-stationary
    Their characteristics are different. In my SSE counters, I'm using the first one. I'm using the second one in literal coding. This gives good results. But I know they are actually poor. We must move on to newer things.

  19. #19
    Administrator Shelwien's Avatar
    It's actually completely wrong to just take some known counter sample and use it.
    Counter is basically a standalone model for symbols in context.
    And traditional counters are only good for almost-random bit sequences.
    While looking at the actual data, you may find some patterns.
    Eg. if there're contextual dependencies in the context history, then the right idea
    probably would be adding these history bits to the context (or SSE context),
    not optimizing the counter parameters or something like that.

  20. #20
    Programmer osmanturan's Avatar
    Note that I already use something based on bit histories (like the previous whole byte, the high 3 bits of the previous whole byte, etc). But when I changed the bit model, compression became worse or better. That is why I'm looking for new ideas about counters. I think SSE needs a completely different bit model for its entries. Maybe a different model for each entry (e.g. a different adaptation speed). But I'm not sure.

  21. #21
    Programmer toffer's Avatar
    Quote Originally Posted by osmanturan View Post
    Note that I already use something based on bit histories (like previous whole byte, high 3 bits of previous whole byte etc).
    That's not a bit history. A bit history is the bit sequence you encounter under a context after contextual separation.

    Quote Originally Posted by osmanturan View Post
    But, when I changed bit model, compression becomes worser or better. This is the answer why I'm looking for new ideas about counters. I think, SSE needs completely different bit model for it's entries. Maybe different model for each entries (eg. different adaptation speed). But, I'm not sure.
    I hope I could help with my last post on the old forum. There are a lot of good threads about counters/SSE. Just have a look at them.

  22. #22
    Administrator Shelwien's Avatar
    > But, when I changed the bit model, compression became worse or better.
    > This is the answer why I'm looking for new ideas about counters.

    Just think about it. A counter is a model for some subsequence of the data bits.
    So you can extract that sequence and make a separate model for it;
    then you'd know the performance limit for a counter working on this sequence.

    > I think, SSE needs completely different bit model for it's entries.
    > Maybe different model for each entries (eg. different adaptation speed).

    SSE has two basic applications:
    1. Probability stabilization. Obviously, any context has less statistics
    than whole data, so it may assign too small probabilities to some rare symbols.
    But improvement from fixing that is usually insignificant.
    Though it might become significant (like in paq) if we'd knowingly use
    a faulty model for some reason.
    2. Context clustering. There're usually too many relevant contexts, so
    we cannot really collect the full dependencies. Instead, we can make
    a few models with different context subsets and combine their predictions.
    But there's this other way too: you can merge statistics of some contexts
    if you know they're similar.
    And the best way to measure the actual similarity of two contexts is
    to compare their histories - contexts with similar histories make similar
    predictions, so they can be merged.
    Then, a probability is a function of a context history, so it can be
    used as a context similarity measure; that's why SSE allows expanding
    contexts and significantly improves compression.
    But there's no reason to think that a better method doesn't exist,
    eg. using directly the context histories.
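    To make application 1 concrete, here's a minimal APM/SSE-style sketch: the input probability indexes a small table of adaptive entries, initialized to the identity mapping, with linear interpolation between neighbors (the 33-entry layout is modeled on paq's APM; the exact sizes and rates are my assumptions):

```cpp
// APM/SSE sketch: refines an input probability (0..4095) through a
// 33-entry adaptive table. Initialized to identity, so at the start
// the mapping changes nothing; it then adapts toward observed outcomes.
class TAPM {
  unsigned short t[33];   // interpolation points over 0..4096
  int idx;                // lower table index from the last P() call
public:
  TAPM() : idx(0) { for (int i = 0; i < 33; ++i) t[i] = i * 128; }

  int P(int pr) {
    int w = pr & 127;     // interpolation weight between two entries
    idx = pr >> 7;        // lower entry index (0..31)
    return (t[idx] * (128 - w) + t[idx + 1] * w) >> 7;
  }

  void Update(int y) {
    // nudge both entries around the last input toward the outcome
    int target = y ? 4095 : 0;
    t[idx]     += (target - t[idx])     >> 6;
    t[idx + 1] += (target - t[idx + 1]) >> 6;
  }
};
```

    Because the table starts as the identity, plugging it in cannot hurt at first - exactly the initialization property described earlier in this thread.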

  23. #23
    Programmer osmanturan's Avatar
    Quote Originally Posted by toffer View Post
    That's not a bit history. A bit history is the bit sequence you encounter under a context after contextual separation.
    What is the exact meaning of contextual separation?

    Quote Originally Posted by toffer View Post
    I hope i could help with my last post on the old forum. There are a lot of good threads about couters/SSE. Just have a look at that.
    I have read your post before. I implemented my SSE following your explanation. I think your explanation is very, very clear. Thanks a lot again.

    I found SSE to be a highly sensitive process (the context, constants, bit model, prediction preprocessing, etc. give SSE very different characteristics). Whenever I change anything, compression becomes much better or much worse (for example, on valley.cmb it varies by about +-1 MB!)

