
Thread: Encoding FLAC residuals (alternatives to Rice encoding)

  1. #1
    Member
    Join Date
    Jan 2020
    Location
    New Haven, CT
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Encoding FLAC residuals (alternatives to Rice encoding)

    I've been wanting to work on a pet project to basically re-implement FLAC (the lossless audio codec). At a very high level there is a predictor, followed by a coding stage that compresses the predictor's residuals (errors). While FLAC uses Rice coding for these, I noticed that the format reserves space for additional coding methods.

    So I was thinking, to make this more interesting, maybe I could compress these residuals with Huffman, arithmetic coding, etc. and see how that compares. Recently I became aware of various ANS implementations (FSE, etc.) and their success; those might be even more interesting to implement and could perhaps allow for better compression ratios.

    However, I'm wondering whether there exists some sort of corpus of FLAC residual bitstreams, or whether the probability distribution of these residuals (for most music/audio files) is known to any extent, which would indicate whether ANS/arithmetic coding would bring any compression-ratio benefit in the first place. According to the FLAC format documentation these numbers are typically small, which is why Rice coding is efficient, but I can't seem to find anything about their distribution.
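    For reference, a minimal sketch of the scheme FLAC applies to residuals: fold signed values to non-negative integers (zigzag), then Rice-code them with a unary quotient and k raw remainder bits. The function names are mine, and bits are emitted as a string purely for illustration:

```python
def zigzag(n: int) -> int:
    """Fold a signed residual to a non-negative integer, as FLAC does:
    0, -1, 1, -2, 2, ...  ->  0, 1, 2, 3, 4, ..."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)

def rice_encode(x: int, k: int) -> str:
    """Rice code of non-negative x with parameter k:
    quotient x >> k in unary (1-bits, 0-terminated), then k remainder bits."""
    q = x >> k
    bits = "1" * q + "0"
    if k:
        bits += format(x & ((1 << k) - 1), f"0{k}b")
    return bits

# Example: residual -3 with k = 2 -> zigzag gives 5 -> code "1001".
code = rice_encode(zigzag(-3), 2)
```

    Small residuals map to short codes, which is why the distribution question matters: Rice coding is only near-optimal when the folded values are roughly geometrically distributed.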

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,714
    Thanks
    271
    Thanked 1,185 Times in 656 Posts
    You can compare FLAC with other lossless audio codecs that use arithmetic coding, like Monkey's Audio or OptimFROG
    (though the latter likely has a better predictor too).
    Based on squeezechart wav stats:
    Code:
    SAC  21324487+65919345+20771857+22759276+25340833+52216710 = 208332508
    OFR  21394272+65905363+20880497+23179339+25262790+52433236 = 209055497
    MAC  21773012+66797828+21499324+23246616+25823788+52855336 = 211995904
    FLAC 22369503+69547135+22708163+24927481+26791804+54394889 = 220738975
    
    (1-211995904/220738975)*100 = 3.96%
    (1-208332508/220738975)*100 = 5.62%
    You can expect about a 4% compression improvement over FLAC from better entropy coding, and another 1.5% by using too much computing time.
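    The savings relative to FLAC can be recomputed directly from the totals in the table above:

```python
# Total compressed sizes from the squeezechart wav stats above.
totals = {"SAC": 208332508, "OFR": 209055497,
          "MAC": 211995904, "FLAC": 220738975}

for name, size in totals.items():
    saving = (1 - size / totals["FLAC"]) * 100
    print(f"{name}: {saving:.2f}% smaller than FLAC")
```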

    In any case, residuals for testing should be easy to acquire since flac is open-source.

  3. #3
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    736
    Thanks
    230
    Thanked 233 Times in 143 Posts
    Residuals typically follow a Laplace distribution centered at zero: rho(x) = exp(-|x|/b) / (2b).
    Here is the penalty of using Golomb coding compared to accurate entropy coding; it depends strongly on the width b:

    [Plot: Golomb coding penalty vs. Laplace width b]

    The difficulty is choosing this b based on context, and also adapting such a model; here are some techniques: https://arxiv.org/pdf/1906.03238

    In short, the MLE estimate is just b = average of |x|.
    For context dependence you need to predict b, e.g. using some linear combination of functions of the context, or a neural network, minimizing (b(context) - |x|)^2.
    The simplest form of adaptivity replaces this average with an exponential moving average, using the update step b <- mu * b + (1-mu) * |x|.

    In practice you can have entropy coders prepared for some discretization of b; the encoder and decoder need to choose the same one.
    Here was supposed to be discussion about it: https://encode.su/threads/3124-LOCO-...ny-alternatves
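    A minimal sketch of this adaptive scheme, assuming a simple discretization that maps the width estimate b to a Rice parameter near log2(b) (the class and method names are mine):

```python
import math

class LaplaceWidthTracker:
    """Track the Laplace width b via an exponential moving average of |x|.
    Encoder and decoder run the identical update, so they stay in sync."""

    def __init__(self, mu: float = 0.95, b0: float = 1.0):
        self.mu, self.b = mu, b0

    def update(self, x: int) -> None:
        # b <- mu*b + (1-mu)*|x|  (the MLE of b is the mean of |x|)
        self.b = self.mu * self.b + (1.0 - self.mu) * abs(x)

    def rice_k(self) -> int:
        # Discretize b: a Rice parameter near log2(b) is a reasonable choice.
        return max(0, round(math.log2(max(self.b, 1e-9))))

# Usage: pick the coder from the current estimate BEFORE seeing x (causal),
# then update the estimate with x.
tracker = LaplaceWidthTracker()
for x in [0, -1, 2, -5, 12, -30, 64]:
    k = tracker.rice_k()
    tracker.update(x)
```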

  4. Thanks:

    introspec (16th January 2020)

  5. #4
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 198 Times in 147 Posts
    It is probably simple to insert file-write statements into the FLAC output functions.
    Write the integer values in a raw 32-bit format for easier testing.

    Possible methods for encoding with entropy coding:
    1 - Encode the Rice coding output with a bitwise range coder like Turbo-Range-Coder instead of plain bit I/O.
    You can use (length-limited) Rice coding for the quotient and code the remainder with bit I/O.

    2 - Encode the residuals with variable-length coding using a range coder, Huffman or ANS.
    - Gamma coding with RC: you can test this with the gamma coding in Turbo-Range-Coder
    by writing the integer values into a raw 32-bit file.
    - Golomb/Rice coding with RC: encode the quotient with the RC, as in the gamma coding, and encode the remainder with bit I/O.
    - Length-limited Golomb/Rice coding: use the same Golomb parameters as in FLAC.
    Encode the length-limited part (e.g. 8 bits) with a fixed-length entropy coder (RC, Huffman or ANS) and the remainder with bit I/O.
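    The hybrid split in method 2 can be sketched like this (helper names are mine, and no actual range coder is involved): each residual becomes a small quotient symbol, which is a good target for an adaptive entropy coder, plus k raw remainder bits that bypass it; a cap qmax with an escape symbol gives the length limiting.

```python
def split_residual(n: int, k: int, qmax: int = 8):
    """Split a signed residual into (symbol, raw_bits) for hybrid coding.
    The zigzag-folded value's top part becomes a small symbol for the
    entropy coder; qmax caps it (length-limiting) via an escape symbol."""
    u = (n << 1) if n >= 0 else ((-n << 1) - 1)   # zigzag fold
    q, r = u >> k, u & ((1 << k) - 1)
    if q < qmax:
        return q, (r, k)        # symbol in [0, qmax), plus k raw bits
    return qmax, (u, 32)        # escape: send u verbatim in 32 raw bits

def join_residual(sym: int, raw: tuple, k: int, qmax: int = 8) -> int:
    """Inverse of split_residual: rebuild u, then unfold the zigzag."""
    value, _nbits = raw
    u = value if sym == qmax else (sym << k) | value
    return (u >> 1) if (u & 1) == 0 else -((u + 1) >> 1)
```

    The point of the split is that the symbol alphabet stays tiny (qmax + 1 symbols), so an adaptive RC/Huffman/ANS model over it is cheap, while the raw bits are close to incompressible anyway.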

  6. #5
    Member
    Join Date
    Jan 2017
    Location
    Germany
    Posts
    63
    Thanks
    31
    Thanked 14 Times in 11 Posts
    What I know:
    In the lossless audio codec MPEG-4 ALS, an algorithm called Block Gilbert-Moore Coding (BGMC) is used to code the residuals. There is a PDF document that provides some information about it.

  7. Thanks:

    Bulat Ziganshin (17th January 2020)

