
Thread: Asymmetric Numeral System


  #182 - nburns (Member, San Diego)
    Quote Originally Posted by Cyan View Post
    ...[skal's] code is basically a modified version of FSE.
    But without any credit.
    ...
    On the flip side, FSE is amply referenced in multiple places on the web. I don't think it could be erased from history at this point.

  #183 - Jarek (Member, Kraków, Poland)
    Even skal's name, FSC, clearly points to the reference ...

  #184 - nburns (Member, San Diego)
    Would someone update the Facebook of Compression with the true history, please!

  #185 - nburns (Member, San Diego)
    Quote Originally Posted by Jarek View Post
    Pascal Massimino (skal) has made his code available: https://github.com/skal65535/fsc


    ps. nburns, there is an analytical way to "verify the accuracy of a table", as I have done in the paper. For Huffman you just multiply code lengths by frequencies; here it is very similar, but first you have to find the probability distribution of being in the different states - the dominant eigenvector of the stochastic matrix of jumping between these states.
    That's what you'd have to do to solve the full problem, making the fewest assumptions. I'm looking for other things, too, that might be more straightforward to measure or give you different information.

    Huffman is a trivial case, because the output size does not depend on symbol order, every symbol maps to a single code, and you're only dealing with integers.

    ANS is a much different beast.
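    As a concrete illustration of the eigenvector check Jarek describes, here is a minimal sketch. The table layout and all the names (next_state, nb_bits, expected_bits) are hypothetical, not from any particular codec: power iteration finds the stationary distribution over states, and the expected bits per symbol falls out as a weighted sum.

    Code:
    #include <stdio.h>

    #define L    16   /* number of states (tiny hypothetical table) */
    #define NSYM 2    /* alphabet size */

    /* Expected bits/symbol of a table coder: encoding symbol s from state x
       goes to next_state[s][x] after emitting nb_bits[s][x] bits.  pi[] is
       the stationary distribution over states, found by power iteration on
       the Markov chain whose transition probabilities are the symbol
       probabilities p[s]. */
    double expected_bits(int next_state[NSYM][L], int nb_bits[NSYM][L],
                         const double p[NSYM]) {
        double pi[L], tmp[L];
        for (int x = 0; x < L; x++) pi[x] = 1.0 / L;
        for (int it = 0; it < 1000; it++) {          /* power iteration */
            for (int x = 0; x < L; x++) tmp[x] = 0.0;
            for (int x = 0; x < L; x++)
                for (int s = 0; s < NSYM; s++)
                    tmp[next_state[s][x]] += pi[x] * p[s];
            for (int x = 0; x < L; x++) pi[x] = tmp[x];
        }
        double bits = 0.0;                           /* E[bits per symbol] */
        for (int x = 0; x < L; x++)
            for (int s = 0; s < NSYM; s++)
                bits += pi[x] * p[s] * nb_bits[s][x];
        return bits;   /* compare against -sum_s p[s]*log2(p[s]) */
    }
    For Huffman, nb_bits would not depend on the state, so the sum collapses to code length times frequency - which is why it is the trivial case.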
    Last edited by nburns; 18th March 2014 at 10:46.

  #186 - Bulat Ziganshin (Programmer, Uzbekistan)
    Quote Originally Posted by Cyan View Post
    But the case of skal is a bit worse (and the reason why I'm a bit talkative today), since his code is basically a modified version of FSE.
    Contact them first - they may simply not know that there is a convention of crediting the authors of the original work.

    Looking at the FSC initial commit, I see e.g. the PutBlock function, which allows encoding data in small reversed blocks. I don't keep an eye on FSE development - is that also from your code? Maybe the entire FSC is just a reimplementation of Jarek's idea from the ground up, and it matches yours only in the things that can't be done any other way? Or, more probably, he looked into your code and then reimplemented it from scratch?

    I'm sorry that I haven't yet looked into Jarek's paper, so I don't understand which ideas are yours and which are from the paper.

  #187 - Cyan (Member, France)
    Quote Originally Posted by Bulat Ziganshin View Post
    or, more probably, he looked into your code and then reimplemented it from scratch?
    Yes that's the most likely scenario.

    There are a lot of minor implementation choices that could be made in many different ways; they happen to be identical in the two versions.

    I want to make it clear that there is no legal problem in doing so.
    The code is published under a BSD license for this reason: it's meant to be re-used and adapted, and I'm glad people do it.

    Building on top of someone else's code is fine.
    But the normal and ethical way is to simply name the source and, if possible, explain the differences.
    Last edited by Cyan; 5th May 2014 at 13:09.

  #188 - nburns (Member, San Diego)
    Quote Originally Posted by Bulat Ziganshin View Post
    i'm sorry that i don't yet looked into Jarek paper and don't understand what is your ideas and what is from the paper
    I'm not sure if you've grokked the "big picture" yet, but let me take a shot at explaining it, for the benefit of any newcomers, at least:

    ANS is conceptually based on two functions, rank() and reduce_precision(), with inputs and outputs like so:

    S: string of weighted symbols
    r: integer
    rank: S -> r

    i,j: integers
    T,U: stacks
    reduce_precision: i * T -> j * U

    rank() computes a rank for a string of symbols from a weighted alphabet. It can be written as a recurrence:

    Code:
    q,r: integer ranks
    s: symbol
    f: q * s -> r
    
    // f: compute a rank from a rank and an additional symbol
    int f(int r, symbol s) {
      //...
    }
    
    // rank: compute the rank of a string
    int rank(string S) {
      length <- |S|
      if(length == 0) {
        return FIRST_RANK
      }
      else {
        S' <- take_prefix(S, length-1)  // make a new string from first length-1 symbols of S
        s <- last_symbol(S)
        return f(rank(S'), s)
      }
    }
    reduce_precision() removes the least-significant bit of an integer and saves it onto a stack. The result is a shorter integer and a modified stack.

    Code:
    int reduce_precision(int i, stack T) {
      push(low_bit(i), T) 
      j = shift_right(i, 1)
      return j
    }
    If there is no restriction on the size of an integer, the encoding is simply the rank:

    Code:
    // Encoder on a machine with infinite precision:
    print rank(S)
    But as a rule, the rank of any interesting string will exceed the capacity of a machine word. f always increases the rank, so we need a way to detect when the rank is about to overflow:

    Code:
    q: integer rank
    s: symbol
    c: boolean
    would_overflow: q * s -> c
    
    would_overflow = true if f(q,s) exceeds a predefined limit, false otherwise
    When a potential overflow is detected, we apply reduce_precision as many times as necessary, then continue with the usual rank calculation (rewritten as a loop):

    Code:
    r: integer rank
    T: stack
    
    r <- FIRST_RANK
    T <- EMPTY_STACK
    while((s = read()) != EOF) {
      while(would_overflow(r, s)) {
         r = reduce_precision(r, T)
      }
      r = f(r, s)
    }
    // the output is the final rank and the contents of the stack:
    print r, T
    This is an implementation of ANS at a high level.

    Since reduce_precision() saves the truncated bits to a stack, and the stack is part of the output, it can be inverted. The function f is also carefully chosen to have an inverse, so the result of the algorithm is decodable. The size of the output of the overflow-conscious algorithm (ANS) is close to the size of the exact rank(S). rank(x) is chosen to return an integer approximation of the entropy of x, for all x; therefore, ANS achieves compression close to the theoretical bound.

    Since the intermediate rank is the only state maintained by the algorithm, it can be thought of as a finite state machine with a state for every possible rank.
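    Concretely, the scheme above can be instantiated with uABS (the "uniform" binary variant from Jarek's paper) as the f function. The following is a minimal runnable sketch; all constants and names are illustrative, and the renormalization style is the fpaqc-like one whose edge cases are discussed in the next post.

    Code:
    #include <stdio.h>
    #include <stdint.h>

    #define NUM 3            /* P(s = 1) = NUM/DEN, arbitrary example values */
    #define DEN 10
    #define LOW (1u << 16)   /* ranks are kept in [LOW, 2*LOW) */

    static int bits[1 << 20], top = 0;   /* the stack T of saved low bits */

    /* f: one uABS step, mapping (rank, symbol) -> next rank */
    static uint64_t f(uint64_t x, int s) {
        if (s) return (x * DEN) / NUM;                 /* floor(x/p) */
        return ((x + 1) * DEN + (DEN - NUM) - 1)
               / (DEN - NUM) - 1;                      /* ceil((x+1)/(1-p)) - 1 */
    }

    /* encode one symbol: reduce precision until f would not overflow */
    static uint64_t encode_symbol(uint64_t x, int s) {
        while (f(x, s) >= 2ull * LOW) {  /* would_overflow */
            bits[top++] = x & 1;         /* reduce_precision: push low bit */
            x >>= 1;
        }
        return f(x, s);
    }

    /* inverse of f, pulling bits back as needed; symbols come out in
       reverse order of encoding */
    static uint64_t decode_symbol(uint64_t x, int *s) {
        uint64_t cxp = (x * NUM + DEN - 1) / DEN;           /* ceil(x*p) */
        *s = (int)(((x + 1) * NUM + DEN - 1) / DEN - cxp);
        x = *s ? cxp : x - cxp;
        while (x < LOW && top > 0)       /* undo reduce_precision */
            x = (x << 1) | bits[--top];
        return x;
    }

    int main(void) {
        int msg[] = {0, 1, 1, 0, 0, 0, 1, 0}, n = 8, s;
        uint64_t x = LOW;                        /* FIRST_RANK */
        for (int i = 0; i < n; i++) x = encode_symbol(x, msg[i]);
        /* the output would be x plus the stack; here we just decode back */
        for (int i = n - 1; i >= 0; i--) {
            x = decode_symbol(x, &s);
            printf("%d", s);                     /* prints msg reversed */
        }
        printf("\n");
        return 0;
    }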
    Last edited by nburns; 14th November 2014 at 20:54. Reason: Clarified


  #189 - nburns (Member, San Diego)
    For the ANS primer I posted above, I based the final encoder pseudocode on what I've seen done in practice, e.g. in fpaqc. After writing it, I gave it some thought and realized that you can construct situations that cause that algorithm to fail: you can make it wind up in a state that's invalid because it falls outside the range.

    I went back and looked at Jarek's original paper, where he proves the validity of the algorithms, and I found that his proof uses reduce() in a different way than the one I gave here (to use it the proven way, you'd have to do something like jump to a too-large state unconditionally and then reduce back into range). Has anyone else dealt with this?

    Edit:
    I changed the ANS primer to use the known-invertible algorithm.

    Edit:
    Changed back, reflecting the fact that fpaqc tested OK.
    Last edited by nburns; 10th July 2014 at 03:54.

  #190 - nburns (Member, San Diego)
    Well, I finally got around to trying to demonstrate the problem I described in my last post. Using fpaqc as the subject, I exhaustively tested all possible combinations of probability and state. I found no errors in fpaqc.

    I also played around with the predictor, and I seem to have made it faster. Here's the patch:

    Code:
    --- fpaqc.cpp	2007-12-25 17:45:34.000000000 -0800
    +++ fpaqc/fpaqc.cpp	2014-07-03 22:01:34.677510612 -0700
    @@ -67,8 +67,12 @@
       }
     
       void update(int y) {
    +#ifndef FASTPRED
         if (y) t[cxt]+=65536-t[cxt]>>5;
         else t[cxt]-=t[cxt]>>5;
    +#else
    +    t[cxt] = t[cxt] - (t[cxt] - y >> 5) + (-y & 0x7ff);
    +#endif
         if ((cxt+=cxt+y)>=512) cxt=1;
       }
     };
    Last edited by nburns; 9th July 2014 at 03:19.


  #191 - Jarek (Member, Kraków, Poland)
    Pascal has added 4x interleaving and alias rANS to his implementation:
    https://github.com/skal65535/fsc


  #192 - nburns (Member, San Diego)
    Quote Originally Posted by Jarek View Post
    Pascal has added 4x interleaving and alias rANS to his implementation:
    https://github.com/skal65535/fsc
    I'm having a hard time figuring out how to use it. I'm not sure what he means by words. Are those words in the linguistic sense, or words as in fixed-size chunks of data larger than bytes?

    With no command-line options, it works on enwik7.bwts.mtf, but on enwik8.bwts.mtf and enwik9.bwts.mtf it fails in the table-building routine with a generic error message. Am I misusing it somehow, or is that a bug?

  #193 - hexagone (Member, California)
    "words as in fixed-size chunks of data larger than bytes". From the readme file:

    The word-based coding methods (CODING_METHOD_16B and up) will use b=2^16 and write 16b-words at a time. They also have variants with interleaving and alias method.

  #194 - nburns (Member, San Diego)
    Quote Originally Posted by hexagone View Post
    "words as in fixed-size chunks of data larger than bytes". From the readme file:

    The word-based coding methods (CODING_METHOD_16B and up) will use b=2^16 and write 16b-words at a time. They also have variants with interleaving and alias method.
    I think I saw that. So can we conclude that by words he means multi-byte chunks? Part of my point was that the documentation is unclear, and I still feel that way. More pointedly, I threw some basic test data at it and it failed for reasons that are still obscure. What do I need to do to test with enwik9.bwts?*

    *Note that enwik9.bwts ran fine, but the compression was no different from plain enwik9 and looked to match approximately the order-0 entropy of the whole file (roughly what you'd expect from, e.g., Huffman using whole-file symbol frequencies). To compress the BWT(S) properly, it either needs to implement some kind of local adaptation internally or accept something like MTF-transformed data (which failed for enwik8 and up).
    Last edited by nburns; 6th November 2014 at 09:13.

  #195 - skal (Member, France)
    Hi everybody,

    Oh, wow, I didn't know 'fsc' was known outside of my conversation with Jarek about ANS! Thanks for your interest.
    To be clear, it's just a brute-force non-adaptive entropy coder experiment, and will probably stay so.
    It was mostly written to experiment quickly with alternate spread functions ('modulo', 'pack', 'reverse', but
    also Jarek's bucket-sorting one, e.g.), as well as to understand ANS's underlying principles.

    My apologies to Yann: I didn't realize FSE was his; I initially thought "Cyan" was Jarek's nickname (and FSE
    his code). My bad! I updated the README.md with full credits, as due!

    I think I now have clearer ideas about ANS. I really perceive it as an elegant dynamical system, more than an arithmetic
    coder, that relies on a few general principles (like: "state x can encode lg(x) bits of information"). Quite novel and intriguing.

    Some clarifications/remarks:

    * So, yes, 'fsc' is a simple tool for experimenting. Usage is simple:

    ./fsc < input_file > compressed_file
    ./fsc -d < compressed_file > reconstructed_file
    diff reconstructed_file input_file || die


    * With time, I ended up implementing some of Ryg's and Charles' ideas too:
    - interleaving (2x, 4x)
    - word-based output (not per-bit): I stuck to 16-bit because I'm mostly interested in ARM and mobile, where 64-bit arithmetic is not a good idea...
    - the alias method
    - a specialized binary version (bANS?) using 32-bit output and 32-bit precision for probabilities
    - and a 'pure' arithmetic coding version too, for comparison with bANS (here too, 32-bit precision, 32-bit output). Just run './bit_cmp'.
    - ...

    * So far I can tell:
    - bANS is not necessarily faster than arithmetic coding
    - arithmetic coding is usually more precise
    - it's hard to improve precision (deltaH) in ANS. One needs to fiddle with the spread functions quite a bit, in sometimes surprising ways...
    - multi-symbol is very easy (and fast!) with ANS, much easier than arithmetic coding. That's a great selling point.
    - the interleaving speed-up is yummy too!
    - reverse-direction coding in ANS is a pain. Using block-based coding is OK, but you lose a few bytes at the end of each block to store the final state. And it makes adapting probabilities (and context modeling) harder, resorting to multi-pass block coding (-> speed penalty).
    - all the tricks (alias, interleaving, 16-bit output, spread tables) are 'baked into' the final bitstream. That means, for instance, you can't encode with 16-bit output on a mobile phone and then decode the bitstream with a 64-bit reader, or use the alias method in the decoder but not in the encoder... Some platform trade-offs are needed, I guess.
    - I'm unsure how hardware-friendly ANS is overall. Probably better than Huffman, agreed.

    That's it. Overall, I quite like ANS and will keep fiddling around as time permits...

    thx!
    /skal


  #196 - Jarek (Member, Kraków, Poland)
    Hi skal,

    1) Regarding storing the final state, you can compensate for it by storing some information in the initial state.
    For example, use initial state L + s_0, where s_0 is the first symbol - it can be decoded once the bitstream has dried out.
    More sophisticated (useful for rANS) is to start from state x = 1 and apply the rANS (or uABS) step a few times, until you get into I = {L, ..., bL-1}.

    Alternatively, we can use this state as a checksum: fix the initial state while encoding; if there was an error, the decoder will start wandering randomly, most probably ending in a different state, which indicates the error.

    2) Regarding the approximated binary case (which I usually refer to as ABS), let's look at the CABAC M-coder from h.264 ( http://iphome.hhi.de/marpe/download/...arpe_et_al.pdf ):

    Code:
    R_LPS = RTAB[m][(R >> 6) & 3];
    R_MPS = R - R_LPS;
    if (V < (R_MPS << BitsLeft)) {
        R = R_MPS; value = valMPS;
        Rnorm = (R_MPS >> 8) XOR 1;
    } else {
        V = V - (R_MPS << BitsLeft);
        R = R_LPS; value = !valMPS;
        Rnorm = RnormTAB[R_LPS >> 3];
    }
    R = R << Rnorm;
    BitsLeft = BitsLeft - Rnorm;
    if (BitsLeft <= 0) {
        V = (V << M_BITS) | read_bits(M_BITS);
        BitsLeft = BitsLeft + M_BITS;
    }

    In comparison, the decoding step for tABS is just (m is the quantized probability):

    Code:
    t = DecodingTable[m][x];
    value = valMPS XOR t.symbol;
    x = t.NewX + read_bits(t.nbBits);

    No branches, no bit-by-bit renormalization, no multiplication, and a single-number state (e.g. for SIMD) ... plus direct handling of a large alphabet, with no binarization.

    However, for extremely accurate applications like ultracompressors, ABS indeed probably does not bring any advantages.

    3) Regarding accuracy for a large alphabet:
    For rANS you can get as accurate as you want using 64 bits.
    For tANS it is mostly a matter of a large enough L - especially such that all symbols have probability above 1/(L sqrt(2)).
    Regarding symbol spread, I have managed to improve the tuning: spread using not only the quantized probabilities (q_s ~ L p_s) but also the actual ones (p_s) - see: https://github.com/JarekDuda/Asymmet...SystemsToolkit

    4) Regarding the inconvenience of backward encoding for e.g. adaptive situations, it is just a matter of using a buffer: store a sequence of pairs (s_i, D_i), where s_i is the i-th symbol and it comes from distribution number D_i.
    Then encode this buffer in the backward direction (table[x] -> table[D_i][x]); decoding is then straightforward (see the sketch at the end of this post).
    Here is JamesB's implementation of order-1 rANS: http://encode.su/threads/2017-LzTurb...ll=1#post39920

    5) Regarding the comparison with Huffman, tANS requires a bit more memory per L. However, a smaller L can bring a smaller deltaH thanks to handling fractional bits - see slide 5 of https://dl.dropboxusercontent.com/u/12405967/ANSsem.pdf
    A big hardware advantage is that there is no need for costly tree building - initialization is much simpler and cheaper.
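    As a concrete illustration of point 4, here is a self-contained sketch for a binary alphabet using rABS with a simple adaptive bit model (all constants, names, and the predictor are illustrative, not from any particular codec): the model runs forward and buffers the probability it used for each bit; the coder then encodes the buffer backward; the decoder runs forward, updating the same model in lockstep.

    Code:
    #include <stdio.h>
    #include <stdint.h>

    #define PBITS 12
    #define PMAX  (1u << PBITS)        /* probability denominator M */
    #define LOW   (1u << 24)           /* state kept in [2^24, 2^32) */

    static uint8_t out[1 << 20];       /* coded bytes, used as a stack */
    static int pos = 0;

    int main(void) {
        int msg[] = {0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0};
        int n = sizeof msg / sizeof msg[0];
        uint32_t p0hist[16], p0 = PMAX / 2;   /* sized for this demo message */

        /* forward pass: run the adaptive model only, buffering p0 per bit */
        for (int i = 0; i < n; i++) {
            p0hist[i] = p0;
            if (msg[i]) p0 -= p0 >> 5;
            else        p0 += (PMAX - p0) >> 5;
        }

        /* backward pass: rABS-encode the buffered pairs in reverse */
        uint64_t x = LOW;
        for (int i = n - 1; i >= 0; i--) {
            uint32_t p    = msg[i] ? PMAX - p0hist[i] : p0hist[i];
            uint32_t base = msg[i] ? p0hist[i] : 0;
            while (x >= ((uint64_t)p << (32 - PBITS))) {  /* renormalize */
                out[pos++] = x & 0xff;
                x >>= 8;
            }
            x = (x / p) * PMAX + (x % p) + base;
        }

        /* forward decode, adapting the same model in lockstep */
        p0 = PMAX / 2;
        for (int i = 0; i < n; i++) {
            uint32_t xm   = x & (PMAX - 1);
            int bit       = xm >= p0;
            uint32_t p    = bit ? PMAX - p0 : p0;
            uint32_t base = bit ? p0 : 0;
            x = (uint64_t)p * (x >> PBITS) + xm - base;
            while (x < LOW) x = (x << 8) | out[--pos];
            printf("%d", bit);
            if (bit) p0 -= p0 >> 5;
            else     p0 += (PMAX - p0) >> 5;
        }
        printf("\n");   /* prints the original message, in forward order */
        return 0;
    }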

    Best,
    Jarek
    Last edited by Jarek; 14th November 2014 at 00:17.

  #197 - skal (Member, France)
    Jarek,

    Thanks for the very subtle hint in point 1!!

    I implemented it, and it works well at recovering some of the left-over bytes for each block.
    It's a little bit tricky with interleaving, though. I'll update the fsc code when I'm finished dealing
    with corner cases...

    /skal

  #198 - nburns (Member, San Diego)
    In the course of trying to understand ANS (and LaTeX) better, I used LaTeX to typeset some of the ANS formulas in concise form. Representing them in plain ASCII is terrible, and they can be hard to find in their respective papers. Seeing the formulas by themselves, properly typeset, is somewhat easier on the eyes and is good for reference (and interior decorating). I also played with different notation and variable naming; you may like my choices, or you may not. I've attached uABS and rANS along with their respective LaTeX sources.

    Please let me know promptly if you find any errors.

    Edit:
    Both are corrected.
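    For reference, since the attachments may not survive this page, here is a reconstruction of the two formula sets in LaTeX, using the standard notation from Jarek's paper (arXiv:1311.2540) rather than necessarily nburns's exact choices:

    Code:
    % uABS, with p = Pr(s = 1):
    \begin{align*}
    C(0,x) &= \left\lceil \frac{x+1}{1-p} \right\rceil - 1, &
    C(1,x) &= \left\lfloor \frac{x}{p} \right\rfloor, \\
    s &= \lceil (x+1)p \rceil - \lceil xp \rceil, &
    D(x) &= \begin{cases} \big(1,\ \lceil xp \rceil\big) & \text{if } s = 1,\\
    \big(0,\ x - \lceil xp \rceil\big) & \text{if } s = 0. \end{cases}
    \end{align*}

    % rANS, with frequencies f_s summing to M and c_s = f_0 + ... + f_{s-1}:
    \begin{align*}
    C(s,x) &= \left\lfloor x/f_s \right\rfloor \cdot M + c_s + (x \bmod f_s), \\
    D(x) &= \big(s,\ f_s \lfloor x/M \rfloor + (x \bmod M) - c_s\big),
    \quad \text{where } c_s \le x \bmod M < c_{s+1}.
    \end{align*}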
    Attached: uabs_equations.png, rans_equations.png (plus their LaTeX sources).
    Last edited by nburns; 15th November 2014 at 21:17.


  #199 - Nania Francesco (Tester, Italy)
    I was wondering if it is also possible to apply rANS algorithms to predictive bit-by-bit compression (like fpaq, as opposed to fpaqa), and whether even the bits have to be encoded backwards!

  #200 - Jarek (Member, Kraków, Poland)
    Quote Originally Posted by Nania Francesco View Post
    ( http://encode.su/threads/2078-List-o...mplementations ) I was wondering if it is also possible to apply rANS algorithms to predictive bit-by-bit compression (like fpaq, as opposed to fpaqa), and whether even the bits have to be encoded backwards!
    I am not certain I understand the question.
    If you are asking if rANS can be applied to a binary alphabet: sure, but there are probably no advantages (interleaving?) - here is Charles' comparison: http://pastebin.com/qqyLNXFA

    Regarding getting rid of backward encoding, that is a very good question.
    I was not able to do it for ANS, but maybe it can be done with a different approach ... ?
    And yes, an entropy coding automaton can also be made by approximating arithmetic coding (which has forward encoding), giving e.g. so-called quasi-arithmetic coding: http://www.ittc.ku.edu/~jsv/Papers/HoV93.qtfull.pdf
    However, since range coding requires two numbers to represent the current state, the number of states of such automata would grow much faster than for ANS (~L^2) if we wanted better precision or a larger alphabet.
    And so e.g. h.264 uses a different approach: the "M-coder" (see the listing above).

    If you are asking about wasting bits on storing the final state, we can compensate for it by storing some information in the initial state - see point 1 of my answer two posts above.

  #201 - Nania Francesco (Tester, Italy)
    Quote Originally Posted by Jarek View Post
    If you are asking if rANS can be applied to a binary alphabet: sure, but there are probably no advantages (interleaving?) - here is Charles' comparison: http://pastebin.com/qqyLNXFA
    I meant that, for the sake of argument, there would have to be a binary version [0 and 1 symbols] (as opposed to multi-symbol rANS) that keeps a comparable range and state x (rANS). The code might be this:
    Code:
    if (x < (1ull << 32)) {              // renormalize: read 32 bits at once
        x <<= 32;
        x |= *dword_in_ptr++;
    }

    uint64_t xm = x & 4095;              // 12-bit probability precision

    if (xm < p0) {                       // decode a 0 bit
        x = p0 * (x >> 12) + xm;
        p0 += (4096 - p0) >> 5;          // adapt the model
        return 0;
    } else {                             // decode a 1 bit
        x = (4096 - p0) * (x >> 12) + xm - p0;
        p0 -= p0 >> 5;
        return 1;
    }
    But does anything change for compression with this version - should I compress in reverse?
    Last edited by Nania Francesco; 15th November 2014 at 19:28.

  #202 - Jarek (Member, Kraków, Poland)
    I have just linked you something like that (rABS): http://pastebin.com/qqyLNXFA
    and written that, alone, it probably does not bring any advantages ... (interleaving?)

    However, it can indeed be useful for adding binary symbols to a rANS stream ... you can also use the uABS formulas instead (fpaqa/b/c).

  #203 - Nania Francesco (Tester, Italy)
    Thanks; the only thing I did not understand is whether the encoding is asymmetric, as in rANS?

  #204 - Jarek (Member, Kraków, Poland)
    The "asymmetric" word of ANS refers to asymmetrization of standard numeral systems: which are optimal for uniform probability distribution of digits.
    If you refer to reverse encoding, the only forward encoded ANS family I know are prefix codes (but I haven't spent much time on looking), which decoder can be seen as a special case of tANS decoder.

    And generally there are lots of materials about ANS ... I have just added more of them to the list: http://encode.su/threads/2078-List-o...mplementations

  #205 - Matt Mahoney (Expert, Melbourne, Florida, USA)
    You can reverse the stream either during compression or decompression. fpaqa/b/c and lpaq1a reverse it during compression.

  #206 - skal (Member, France)
    About ANS and 1/x distribution:

    I was reading Charles' blog, http://cbloomrants.blogspot.fr/2014/...ing-ans-9.html,
    and there's this question about where the 1/x invariant distribution comes from. I've seen it mentioned in Jarek's
    paper too. Here are some hopefully helpful insights and a demonstration related to this:

    There's a continuous formulation of Charles' initial system that helps in understanding the 1/x distribution
    (these dynamical systems are known as 'piecewise affine dynamical systems').
    The original code for the discrete version would be: http://pastebin.com/NPqmsR8L
    The continuous version of it is here: http://pastebin.com/RVMv1nfq

    It'll show that, whatever slope 'alpha' you choose, the limiting distribution is always 1/(x ln 2).

    I've written the mathematical proof (along with some illustrations and explanation) in the attached pages:


    Attached: handwritten proof pages (IMG_20141118_100901.jpg, IMG_20141118_100912.jpg).

    (sorry, no LaTeX yet)

    Basically, 1/x is the only distribution that solves the functional equation f(x) = a.f(a.x) for every slope a != 1 (indeed, for f(x) = c/x, we get a.f(a.x) = a.c/(a.x) = c/x).

    /skal


  #207 - Jarek (Member, Kraków, Poland)
    Hi skal,
    So Pr(x) depends on the symbol spread.
    For example, Huffman can be seen as a special case of tANS: when symbols are spread in ranges of length 2^k. In this case we get a uniform Pr(x) ~ const instead.
    Generally there is always some divergence from the hyperbolic behavior, but it quickly drops with L for reasonable spreads.
    I tried to pin down this divergence in the previous paper ( http://arxiv.org/abs/0910.2724 ) but failed - the details turn out to be really tough.
    Also, bounding deltaH - where it is essential that sum_s q_s = 1, so that the linear term vanishes in the expansion of the Kullback-Leibler divergence - turns out to hold only approximately and to be tough (fortunately the current estimates seem to work well). There is still a lot to do at the theoretical base of ANS ... I comfort myself with the fact that the situation for arithmetic coding doesn't look much better after these few decades ...

    So currently I just use a simple intuitive explanation like in your note (slide 14 of https://dl.dropboxusercontent.com/u/12405967/ANSsem.pdf ): the transition is
    lg(x) -> lg(x) + lg(1/p) (mod lg(b))
    which has three sources of chaotic behavior: asymmetry, ergodicity, diffusivity (cryptographic applications?). The ergodicity itself strongly suggests that we should expect an approximately uniform distribution for lg(x), which translates into a hyperbolic distribution for x.

    Another intuitive explanation is that x contains lg(x) bits of information, and an event of probability p carries lg(1/p) bits of information; matching the two gives Pr(x) proportional to 1/x.
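    A quick empirical check of this hyperbolic law is easy to run. Here is a minimal sketch (names and constants are illustrative): it uses the uABS encoding step with P(1) = 3/10 as an arbitrary example, feeds it random bits with renormalization to [L, 2L), and histograms the visited states against 1/(x ln 2):

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    #define NUM 3     /* P(s = 1) = 3/10, an arbitrary example */
    #define DEN 10
    #define L   256   /* states live in [L, 2L) */

    static uint64_t f(uint64_t x, int s) {   /* uABS encoding step */
        if (s) return (x * DEN) / NUM;
        return ((x + 1) * DEN + (DEN - NUM) - 1) / (DEN - NUM) - 1;
    }

    int main(void) {
        static long hits[2 * L];
        uint64_t x = L;
        for (long i = 0; i < 10000000; i++) {
            int s = (rand() % DEN) < NUM;        /* random bit, P(1) = 0.3 */
            while (f(x, s) >= 2 * L) x >>= 1;    /* renormalize (the dropped
                                                    bits are ignored here:
                                                    we only care about states) */
            x = f(x, s);
            hits[x]++;                           /* record the visited state */
        }
        for (int v = L; v < 2 * L; v += 64)      /* spot-check the 1/x law */
            printf("x=%3d  observed=%.5f  predicted=%.5f\n",
                   v, hits[v] / 1e7, 1.0 / (v * 0.69314718));
        return 0;
    }
    For a uniform-style spread like uABS, the observed frequencies should land close to the hyperbolic prediction; for a Huffman-like range spread, as noted above, they would come out roughly constant instead.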

  #208 - skal (Member, France)
    So, I'm trying to step back a little and understand why ANS has less precision than, say, arithmetic coding.
    Let's take the binary-symbol case for simplicity.
    I'm considering the case of 16-bit precision for probabilities (so, L = 16), along with 16-bit-word I/O (that is: one reads or writes uint16 only).

    The code is here: http://pastebin.com/FnNK4HDc

    The relative entropy (deltaH = (H-H0)/H0, in percent) looks like this,
    when generating 10^7-long binary messages with probabilities between 0 and 1:

    Attached: dH.png (deltaH, in percent, vs. probability for bANS and arithmetic coding).

    And as one can see, even if it's relatively OK, the compression of bANS is still quite far off compared to arithmetic coding
    with similar precision.

    Is there something I'm doing obviously wrong here? And if not, how do I fix it?
    Is there any magic spread table I should be using in this very particular case of binary coding?

    /skal

  #209 - Jarek (Member, Kraków, Poland)
    Hi skal,
    The variant in that code is rABS (the range variant of the asymmetric binary system; here is Charles': http://pastebin.com/qqyLNXFA ).
    While uABS (u for uniform) is extremely accurate (Matt's fpaqa/b/c), the range variants are not (which is why they were born so late) ... unless we operate on a very large x (there are some estimates in the preprint).

    I see you operate on x \in I = {2^16, ..., 2^32 - 1}, i.e. 32-bit x with 16 bits renormalized at once.
    Basically, there are three ways to improve accuracy for the arithmetic formula and a binary alphabet:
    - switch to the much more accurate uABS formulas: http://mattmahoney.net/dc/dce.html#Section_33 , or
    - switch to 64-bit x, e.g. x \in I = {2^48, ..., 2^64 - 1} with 16-bit renormalization, or
    - switch to 8-bit renormalization, e.g. x \in I = {2^24, ..., 2^32 - 1} for 32-bit x.
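    For illustration, the decoder-side renormalization loops for the last two options would look roughly like this (a sketch; in16 and in8 are hypothetical input pointers):

    Code:
    /* option 2: 64-bit x, 16-bit renormalization, I = {2^48, ..., 2^64 - 1} */
    while (x < (1ull << 48)) x = (x << 16) | *in16++;

    /* option 3: 32-bit x, 8-bit renormalization, I = {2^24, ..., 2^32 - 1} */
    while (x < (1u << 24)) x = (x << 8) | *in8++;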

  #210 - Jarek (Member, Kraków, Poland)
    My previous post was written in a hurry; here is an addition after some thought.
    The problem behind the big inaccuracy of your code is that the denominator of the quantization, M (p[s] ~ q[s]/M), and L, the lower end of I = {L, ..., bL-1} (renormalization), are the same (2^16) - like in the degenerate case here: http://encode.su/threads/2035-tANS-r...st-of-accuracy
    The range variant is a very crude symbol distribution - in ranges instead of the expected uniform spread. However, it can reach as good an accuracy as we want by operating on a very high appearance number of this range ... but in your case it is just the lowest possible: the second appearance.

    For range variants of ANS it is essential that the denominator for the quantization of probabilities, M (p[s] = q[s]/M), is much smaller than L (I = {L, ..., bL-1}) - which means that we work on a high appearance of the range.

    To give some numbers: at the beginning of page 9 of the last preprint there is an upper bound for the rANS inaccuracy,
    |epsilon_s| < M/x
    where M is the denominator of the quantization and x >= L, and
    deltaH ~ sum_s epsilon_s^2 / p_s,
    so, as on page 15, we should expect the general behavior
    deltaH ~ (M/L)^2 * ("alphabet size")^2.
    For example, with M = 2^16 and L = 2^16 (your case) the bound is useless, while reducing M to 2^12 with the same L shrinks the expected deltaH by a factor of about 2^8.

    You use a huge M = 2^16. Just reduce it a bit and you will get much better behavior (or use uABS instead, or increase L).
    And just let me know if you meet disappointing behavior again - maybe it can be repaired ...

    For a binary alphabet you can also use tABS: a set of coding tables for all quantized binary probabilities (e.g. k/32).
    The preprint gives sets for 5 and 16 states, allowing one to work in the deltaH ~ 0.01 or 0.001 bits/symbol regions respectively (unless probabilities are tiny - we need a large number of states to handle those well).
    Storing the entire set of tables in the second case is only e.g. 16x16 = 256 bytes.
    Last edited by Jarek; 22nd November 2014 at 12:06.


