Results 1 to 11 of 11

Thread: Need C++ range code library

  1. #1
    Member
    Join Date
    Aug 2015
    Location
    Nice
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Need C++ range code library

    Where can I find a good C++ implementation of a binary range coder? What I need basically, is some API that I can use to encode a stream of bits with varying probabilities. For example, let RC be an instance of a range coder, I would like to have something like this:

    RC->encode(probability0, probability1, bit)
    RC->encode(probability0, probability1, bit)
    ...
    RC->getEncodedStream()

    Unfortunately I haven't been able to to find one. If there isn't one for C++, can you please suggest a good one in any other language?
    Thank you

  2. #2
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    You may look for example at rc_* or fpaq0* . fpaq_sh tested in Entropy Coder Benchmark

    See also the references mentioned or Simple binary range coder
    Edit: unfortunately the link "http://nishi.dreamhosters.com/u/" is no more accessible
    Last edited by dnd; 7th September 2015 at 21:41.

  3. #3
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    Daala has interesting: large alphabet multiplication-free(!) range coder (entenc, entdec):
    https://github.com/xiph/daala/tree/master/src

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    You can use class Encoder and Decoder from libzpaq. They are public domain. The end of the compressed stream is marked by 4 consecutive 0 bytes, which can never appear elsewhere.

  5. #5
    Member
    Join Date
    Aug 2015
    Location
    Nice
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Matt Mahoney View Post
    The end of the compressed stream is marked by 4 consecutive 0 bytes, which can never appear elsewhere.
    How does the encoder make sure that it never outputs 4 consecutive 0 bytes? Do you add an additional 1 bit after 31 consecutive 0s?

  6. #6
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    695
    Thanks
    153
    Thanked 183 Times in 108 Posts
    Quote Originally Posted by coyote View Post
    How does the encoder make sure that it never outputs 4 consecutive 0 bytes? Do you add an additional 1 bit after 31 consecutive 0s?
    Matt could probably answer this best, but it appears the encoder and decoder check for a low value of 0 after shifting encoded bytes and add one to the low value when it is zero.

    For instance, here is the encoder code with a comment at the end about not encoding four 0 bytes in a row:

    // compress bit y having probability p/64K
    void Encoder::encode(int y, int p) {
    assert(out);
    assert(p>=0 && p<65536);
    assert(y==0 || y==1);
    assert(high>low && low>0);
    U32 mid=low+U32(((high-low)*U64(U32(p)))>>16); // split range
    assert(high>mid && mid>=low);
    if (y) high=mid; else low=mid+1; // pick half
    while ((high^low)<0x1000000) { // write identical leading bytes
    out->put(high>>24); // same as low>>24
    high=high<<8|255;
    low=low<<8;
    low+=(low==0); // so we don't code 4 0 bytes in a row
    }
    }

  7. #7
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    btw, replacing "while" in this code with 4 consecutive "if" may improve CPU branch prediction

  8. Thanks (2):

    encode (2nd April 2016),Matt Mahoney (14th September 2015)

  9. #8
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    695
    Thanks
    153
    Thanked 183 Times in 108 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    btw, replacing "while" in this code with 4 consecutive "if" may improve CPU branch prediction
    I assume the "if"s should be nested. Does this help when the probability of the while condition being true is significantly less than 50% for any iteration? I sometimes have code more like this to try to avoid one jump but am not sure it actually helps or is worthwhile:

    // compress bit y having probability p/64K
    void Encoder::encode(int y, int p) {
    assert(out);
    assert(p>=0 && p<65536);
    assert(y==0 || y==1);
    assert(high>low && low>0);
    U32 mid=low+U32(((high-low)*U64(U32(p)))>>16); // split range
    assert(high>mid && mid>=low);
    if (y) high=mid; else low=mid+1; // pick half
    if ((high^low)<0x1000000) { // write identical leading bytes
    do {
    out->put(high>>24); // same as low>>24
    high=high<<8|255;
    low=low<<8;
    low+=(low==0); // so we don't code 4 0 bytes in a row
    } while ((high^low)<0x1000000);
    }
    }

  10. #9
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    the benefit of "if" is separation of code with different probability. if it is much less than 50% for every "while" iteration, it should make no difference

    20+ years ago compilers generated better code for "do while" rather than "while" so replacing it with "if {do while}" can made some snse. i believe that modern compilers are much smarter but not checked it
    Last edited by Bulat Ziganshin; 7th September 2015 at 03:11.

  11. Thanks:

    Kennon Conrad (7th September 2015)

  12. #10
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Kennon Conrad is correct. Also, part 4 of the ZPAQ specification gives the pseudo-code for the decoder. http://mattmahoney.net/dc/zpaq204.pdf

  13. #11
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    For completeness, decoding step of uABS (near Matt's fpaqc):
    uint64_t xp = (uint64_t)x * p;      
    out = ((xp & 0xffff) + p) >> L_pb;
    xp >>= L_pb;
    x = out ? xp : x - xp;
    if (x < (1 << 16)) { x = (x << 16) | *ptr++; } //renormalization

    of rABS:
    uint16_t xfrac = x & 0xffff;
    uint32_t x_new = p * (x >> 16);
    if (xfrac < p) { x = x_new + xfrac; out = 0;}
    else { x -= x_new + p; out = 1; }
    if (x < (1 << 16)) { x = (x << 16) | *ptr++; } //renormalization

    of tABS:
    t = decodingTable[probability][x];
    out = t.symbol ^ AH; // AH - probability above 1/2
    x = t.newX | readBits(t.nbBits);

    benchamarks, sources, discussion: http://encode.su/threads/1821-Asymet...l-System/page8

  14. Thanks:

    Bulat Ziganshin (16th September 2015)

Similar Threads

  1. Replies: 33
    Last Post: 2nd July 2018, 20:18
  2. Hashing a large range of integers to a smaller range
    By mtwaha in forum Data Compression
    Replies: 3
    Last Post: 11th April 2011, 23:27
  3. How fast should be a range coder ?
    By Cyan in forum Data Compression
    Replies: 33
    Last Post: 16th November 2009, 16:02
  4. QuickLZ ZIP - new zip/deflate library
    By Lasse Reinhold in forum Forum Archive
    Replies: 23
    Last Post: 1st October 2007, 22:08
  5. MM compression library
    By Bulat Ziganshin in forum Forum Archive
    Replies: 29
    Last Post: 12th September 2007, 15:40

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •