Results 1 to 8 of 8

Thread: Is ANS only near optimal ( ignoring end byte padding if any ) ?

  1. #1
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    265
    Thanks
    13
    Thanked 0 Times in 0 Posts

    Is ANS only near optimal ( ignoring end byte padding if any ) ?

    Hi :

    Can ANS be made 'optimal' ( ie equates Shannon's exact number of bits )? OR is it already but for end byte padding, multiplicities frequencies recording within Headers etc ?

  2. #2
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,497
    Thanks
    26
    Thanked 132 Times in 102 Posts
    Probably every codec that works on fixed size numbers and does periodical renormalization (i.e. virtually every codec) is not perfectly optimal. You probably would need to operate on big (big precision wise) floating point numbers with size comparable to compressed file size to make the process as close to theory as possible.

  3. Thanks:

    LawCounsels (12th March 2020)

  4. #3
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    ANS and arithmetic coding work on some approximated probability distributions, but can get as close to a chosen distribution as we want by increasing the state, which acts as a buffer for fractional bits.
    In ANS we get ~4 times smaller redundancy (used additional bits/symbol) for each doubling of the number of states. In practice, using ~8x more states than alphabet size we are operating on level of inaccuracy of estimation.

    ANS additionally needs to store the final encoder state, however, we can compensate it by storing some information in the initial encoder state.

  5. #4
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    265
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Can ANS be made exact same bits as Shannon's limit , using Arbitrary Precision Integers etc etc ? ( ignore any end byte padding, recording multiplicities frequencies in Header Headers etc etc ) ?

  6. #5
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    Huffman is exact - for sources with power of 2 probabilities.
    In rANS we approximate probabilities with 1/2^m denominator, and then there is still some approximation - it rather never is perfect, but you can easily get negligible e.g. 10^-10 bits/symbol redundancy.

  7. #6
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    265
    Thanks
    13
    Thanked 0 Times in 0 Posts
    With 10^-10 , presumably many many ( vast majority ) input files give exact same number of bits as Shannon's ?

  8. #7
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    Using 8x more states than alphabet size you get ~10^-3, 10^-4 bits/symbol more ... and it reduces ~4x every doubling of number of states.
    In rANS using 32bit state for 8bit alphabet, it gives you ~10^-20 bits/symbol accuracy ... you can use 64bit if it is not enough.

    Anyway, inaccuracy of AC/ANS is rather negligible in practice - what counts are better predictions, transformations, statistical models ...

  9. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,497
    Thanks
    26
    Thanked 132 Times in 102 Posts
    Inefficiencies are usually very small. Look at old aridemo from Eugene: http://compression.ru/sh/downl_e.htm http://compression.ru/sh/coders6a.rar

    Results:
    Code:
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ������� ����     17,390,634
    AriDemo Model:   o0c_v2a.inc
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
                 CodeSize   C-Time   D-Time
    
    CL-D       10,680,328    9.66    11.47
    CL-R       10,680,668    5.55     7.08
    CL-Rf      10,680,351    5.44     6.70
    CL-Rfm     10,680,327    6.86     8.62
    Subbotin   10,682,917    5.55     6.92
    Subb-LB    10,682,917    5.55     6.97
    Shindlet   10,680,348    5.77     6.48
    Shcoder    10,680,642    6.97     8.34
    Ari        10,680,318   12.47    14.06
    
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Same probabilities, different arithmetic / range coders that work on fixed size number formats.

Similar Threads

  1. paq8 on low end hardware
    By kaitz in forum The Off-Topic Lounge
    Replies: 0
    Last Post: 31st August 2019, 18:38
  2. Image compression and end-user display resolution
    By SolidComp in forum Data Compression
    Replies: 3
    Last Post: 2nd February 2019, 10:28
  3. Another ANS opportunity ?
    By boxerab in forum Data Compression
    Replies: 0
    Last Post: 10th August 2017, 16:58
  4. ELI5: FSE/ANS
    By cassini in forum Data Compression
    Replies: 10
    Last Post: 28th September 2016, 22:53
  5. Which ANS is the Best
    By calthax in forum Data Compression
    Replies: 8
    Last Post: 9th September 2014, 16:13

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •