
Thread: Compression codecs that use floating point math

  1. #1
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts

    Compression codecs that use floating point math

    Hi all – Are there any codecs that use floating point? When I looked at various codecs' repositories a while back, I didn't find any floats or doubles. What are some codecs that use floating point?

    How do arithmetic coders handle their values internally?

    Thanks.

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,304 Times in 740 Posts
    C/C++ compilers are broken with regard to consistent floating-point arithmetic.
    And now CPUs are broken too: https://stackoverflow.com/questions/...g-point-errors

    Still, it's possible to use floats in asymmetric algorithms (e.g. optimization of LPC coefficients) or in lossy compression.

    > How do arithmetic coders handle their values internally?

    Normally AC and ANS just work with integers.
    There're some experimental rangecoder implementations using floats,
    but they're not reliable and don't provide any benefits.
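    For illustration, here is a minimal sketch of an integer-only rANS encoding step, in the spirit of ryg_rans (illustrative only; the model supplying cum/freq is assumed, and the final state flush is omitted):

    // Sketch: one rANS encoding step using nothing but integer arithmetic.
    // 'cum' and 'freq' come from some model, with a total of (1 << scale_bits).
    #include <cstdint>
    #include <vector>

    constexpr uint32_t kRansL = 1u << 23;   // lower bound of the normalized state; encoder starts with x = kRansL

    // Push one symbol into the state; bytes written to 'out' are consumed by the
    // decoder in reverse order (and symbols are typically encoded in reverse, too).
    void rans_put(uint32_t& x, std::vector<uint8_t>& out,
                  uint32_t cum, uint32_t freq, uint32_t scale_bits) {
        uint32_t x_max = ((kRansL >> scale_bits) << 8) * freq;
        while (x >= x_max) {                 // renormalize: flush low bytes
            out.push_back(static_cast<uint8_t>(x & 0xFF));
            x >>= 8;
        }
        x = ((x / freq) << scale_bits) + (x % freq) + cum;   // pure integer update
    }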

  3. Thanks:

    SolidComp (9th June 2020)

  4. #3
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    44
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Some proof-of-concept coders use floats and even doubles. E.g.:

    /* Dynamic Markov Compression (DMC) Version 0.0.0
    *
    *
    * Copyright 1993, 1987
    *
    * Gordon V. Cormack
    * University of Waterloo
    * cormack@uwaterloo.ca
    */
    /* This program implements DMC as described in
    *
    * "Data Compression using Dynamic Markov Modelling",
    * by Gordon Cormack and Nigel Horspool
    * in Computer Journal 30:6 (December 1987)
    *
    * It uses floating point so it isn't fast. Converting to fixed point
    * isn't too difficult.
    */

    Or

    /* Arithmetic coding routines by A.C. Popat, 1990. For theory, see
     *   Popat, Ashok C., "Scalar Quantization with Arithmetic Coding,"
     *   SM thesis, MIT Dept. of Elec. Eng. and Comp. Sci., Cambridge,
     *   Massachusetts, June 1990.
     */

    I would say that if your compressor uses some form of adaptive quantization, it tends to use FP math.


    Or

    // Dasher v. 1.6
    // Copyright (C) 2001 (David Ward djw30@cam.ac.uk)

    Or PPMY (no idea who that is from).

    Doubles throughout.

    The experiments before paq (e.g. p12, p5, p6) have double-precision probabilistic "neurons".

    There is an MQ-coder implementation which uses doubles throughout.

    That's about storing values in FP; there are too many projects computing log2 in FP to mention.
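    (For illustration, a sketch of the kind of double-precision "neuron" such mixers use: logistic mixing of model predictions with gradient-descent weight updates. This is illustrative only, not code from p5/p6/paq; names and the learning rate are made up.)

    #include <cmath>
    #include <vector>

    // Logistic (squash) and its inverse (stretch); inputs must lie in (0,1).
    double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }
    double stretch(double p) { return std::log(p / (1.0 - p)); }

    struct Mixer {
        std::vector<double> w;   // one weight per model
        double lr = 0.01;        // learning rate (illustrative value)

        // Combine per-model probabilities of bit==1 in the logistic domain.
        double mix(const std::vector<double>& p) const {
            double dot = 0.0;
            for (size_t i = 0; i < w.size(); ++i) dot += w[i] * stretch(p[i]);
            return squash(dot);
        }
        // After coding the bit, nudge the weights to reduce coding cost.
        void update(const std::vector<double>& p, double p_mix, int bit) {
            double err = bit - p_mix;
            for (size_t i = 0; i < w.size(); ++i) w[i] += lr * err * stretch(p[i]);
        }
    };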

  5. Thanks:

    SolidComp (13th June 2020)

  6. #4
    Member
    Join Date
    May 2019
    Location
    Japan
    Posts
    27
    Thanks
    4
    Thanked 8 Times in 4 Posts
    I recently found a data corruption bug in the GeCo3 compressor ( https://github.com/cobilab/geco3/issues/1 ). They fixed it by refactoring some of the floating-point math code. (After that they also decided to remove the "-Ofast" gcc switch from their makefile. A good decision, since "-Ofast" implies "-ffast-math".)

    Although the compressor now passes my tests, I'm still not sure if it's safe to use. The mechanism by which the input file name was related to the reproducible data corruption is still unknown, so there may still be other issues (I suspect memory safety problems).

    Another case showing that extreme care is needed when using floating point math in reproducible experiments.
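    (As an aside, a small illustration of why dropping "-Ofast" matters: "-ffast-math" implies "-ffinite-math-only", so the compiler may assume NaN never occurs and can fold away NaN checks like the one below. Illustrative sketch only.)

    // With -ffast-math / -ffinite-math-only the compiler is allowed to assume
    // 'x' is never NaN, so this self-comparison check may be folded to 'false'
    // and corrupted values can slip through unnoticed.
    bool looks_like_nan(double x) {
        return x != x;
    }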

  7. #5
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    289
    Thanks
    10
    Thanked 34 Times in 22 Posts
    OptimFROG uses FP math, and so does SAC; both are lossless.

  8. #6
    Member
    Join Date
    May 2017
    Location
    Germany
    Posts
    88
    Thanks
    55
    Thanked 41 Times in 25 Posts
    Quote Originally Posted by Shelwien View Post
    C/C++ compilers are broken with regard to consistent floating-point arithmetic.
    And now CPUs are broken too: https://stackoverflow.com/questions/...g-point-errors
    I understand
    1. compilers have flags to perform "unsafe" optimization of floating-point expressions ("fast math")
    2. with the x87 FPU, intermediate rounding makes normal C programs practically unpredictable
    3. libraries implement their functions differently (including the C runtime of your choice)

    But could you please elaborate on the CPU part? I was under the impression that math should be fully reproducible even on different CPUs, as long as
    1. you don't enable lossy compiler optimization
    2. it uses single- or double-precision, not 80-bit floats
    3. it uses SSE/AVX registers instead of the FPU stack
    4. it does not rely on libraries, i.e. only add/sub/mul/div
    5. you don't mess with FPU flags like flush-to-zero or rounding mode
    (The linked question was solved with correct compiler settings. Where does the CPU come into play?)

  9. #7
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,304 Times in 740 Posts
    > compilers have flags to perform "unsafe" optimization of floating-point expressions ("fast math")
    > with the x87 FPU, intermediate rounding makes normal C programs practically unpredictable
    > libraries implement their functions differently (including the C runtime of your choice)

    1) Recent compilers try to avoid using the x87 FPU at all - instead they emit SSE/AVX code,
    which works at the element's precision. However, due to different optimizations it
    might work differently on various platforms (Intel/AMD etc.).

    2) In FP, even a+b==b+a is not true. Unfortunately it wasn't a concern during the early
    development of compilers, so now it's very hard to fix.
    Just count the number of proposed workarounds here: https://www.nccs.nasa.gov/images/Flo...onsistency.pdf
    Note that even after all that, they still can't guarantee the same results everywhere,
    just a limited error.

    > But could you please elaborate on the CPU part?
    > I was under the impression that math should be fully reproducable even on different CPUs, as long as
    > you don't enable lossy compiler optimization

    It was speculation in https://stackoverflow.com/a/968460/395609 ,
    based on information which turned out to be false (results did change
    at runtime, but that was due to different alignment of dynamically allocated blocks).

    But with all the Spectre bugs, I wouldn't be surprised if FP consistency
    is not always preserved even at the CPU level, because of instruction reordering etc.

    > (The linked question was solved with correct compiler settings. Where does the CPU come into play?)

    It wasn't solved in the sense of getting exactly the same binary results everywhere,
    just error within bounds.

    In any case, I did stumble on this issue in my own projects, and wasted a lot of time on software workarounds -
    up to using a custom wrapper class for floats with volatile variables everywhere - it still wasn't 100% safe at the time,
    and it turned out not to be future-proof (newer compiler versions ignore my workarounds).
    Thus I think that the only way to get consistent results is to not use FP at all.
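    (For reference, a minimal sketch of that kind of wrapper, illustrative only and not the actual project code: every operation goes through a volatile temporary so the result is stored and rounded to 32-bit precision at each step instead of staying in wider registers. Slow, and as said above, newer compilers may not honor it.)

    struct SafeFloat {
        float v;
        SafeFloat(float x = 0) { volatile float t = x; v = t; }   // force a rounded store on construction
        friend SafeFloat operator+(SafeFloat a, SafeFloat b) { volatile float t = a.v + b.v; return SafeFloat(t); }
        friend SafeFloat operator-(SafeFloat a, SafeFloat b) { volatile float t = a.v - b.v; return SafeFloat(t); }
        friend SafeFloat operator*(SafeFloat a, SafeFloat b) { volatile float t = a.v * b.v; return SafeFloat(t); }
        friend SafeFloat operator/(SafeFloat a, SafeFloat b) { volatile float t = a.v / b.v; return SafeFloat(t); }
        operator float() const { return v; }
    };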

  10. #8
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Brotli quality 10 and 11 use floating point on the encoding side for entropy estimation.

    Zopfli uses floating point for entropy estimation.

    The JPEG XL reference implementation is fully 32-bit floating point: (r, g, b, a) are four 32-bit floating-point numbers during decompression. (Say goodbye and farewell to banding artefacts, and welcome to maxed-out wide gamut and HDR!)

  11. #9
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    107
    Thanks
    37
    Thanked 29 Times in 20 Posts
    Entropy estimation and lossy codecs don't count. Rounding errors there don't affect correctness.

  12. #10
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    Brotli quality 10 and 11 use floating point on the encoding side for entropy estimation.

    Zopfli uses floating point for entropy estimation.

    The JPEG XL reference implementation is fully 32-bit floating point: (r, g, b, a) are four 32-bit floating-point numbers during decompression. (Say goodbye and farewell to banding artefacts, and welcome to maxed-out wide gamut and HDR!)

    Wow, I thought 16-bit floats per channel were the most that gives any visual benefit for humans. I don't know whether that covers HDR or not, but I thought JPEG-XR and OpenEXR used 16-bit.

  13. #11
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    16 bits is usually enough, but unsigned 16-bit has some issues when one wants to do image processing with it -- one would only have 11-12 bits of actual dynamics, and that can already start causing issues. However, 32-bit float was chosen because of better support in streaming instructions on a large variety of platforms. Despite this, and despite being YUV444, JPEG XL decoding is still about 2x faster than other modern codecs.

  14. #12
    Member
    Join Date
    May 2017
    Location
    Germany
    Posts
    88
    Thanks
    55
    Thanked 41 Times in 25 Posts
    Quote Originally Posted by Shelwien View Post
    1) […] However due to different optimizations it might work differently on various platforms (Intel/AMD etc).
    I just don’t think so. For example, the racing game Live for Speed stores its record times as a sequence of player inputs and uses them during replay to re-run the original simulation of the lap. It has done so since the early 2000s, through dozens of updates. While there are occasional problems on overclocked/overheated CPUs, I have yet to hear of failures specific to CPU models. Games are chaotic systems; if CPUs were non-deterministic with floats, what Live for Speed does would be impossible.

    Quote Originally Posted by Shelwien View Post
    2) In FP, even a+b==b+a is not true. Unfortunately it wasn't a concern during the early development of compilers, so now it's very hard to fix.
    I know it for (a+b)+c vs. a+(b+c), but with only two numbers?! How is that?! IEEE754 guarantees 0.5 ULP of accuracy, and even if the mathematical result happens to lie exactly halfway between two float representations, ties-to-even takes effect.
    (The situation is different for min/max. These are actually specified as max(a,b)!=max(b,a) if one of them is NaN.)

    Regarding your project: Yeah, C++ made some pretty bad mistakes in the way it treats volatile, and compiler vendors are definitely at fault for pushing their fast-math options. It’s incredibly hard, but I still wouldn’t blame the CPU. I’ve worked in CAD, and we had a tough time with floating-point peculiarities (like splitting a line segment in the middle creating just one new line instead of two), but our computations were 100% reproducible between CPUs, even without volatile – as long as we used the correct compiler options and stuck to the same math library.

    Quote Originally Posted by Jyrki Alakuijala View Post
    However, the 32-bit float was chosen because of better support in streaming instructions on a large variety of platforms.
    This. 32-bit floats can perform 23-bit integer math perfectly, often at higher throughput and with wide SIMD availability across platforms.
    If you are purely compute-limited, you may profit from the increased number of registers as well. Branching is often harder, though.
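    (A quick check of the "small integers are exact" part, as a sketch: every integer up to 2^24 round-trips through a 32-bit float, and 16777217 = 2^24 + 1 is the first one that doesn't.)

    #include <cstdint>
    #include <cstdio>

    int main() {
        for (uint32_t i = 0; i <= (1u << 24) + 1; ++i) {
            if (static_cast<uint32_t>(static_cast<float>(i)) != i) {
                std::printf("first integer not exactly representable: %u\n", (unsigned)i);
                return 0;   // prints 16777217
            }
        }
    }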

  15. #13
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,304 Times in 740 Posts
    > It does so since the early 2000s, through dozens of updates.
    > While there are occasional problems on overclocked/overheated CPUs,

    Well, as example: https://nfs.fandom.com/wiki/Motor_Ci.../Patch_History
    "Fixed a bug which caused replays to behave erratically."
    "Replay no longer experience strange camera bumpiness. Note, this will break existing replay files."

    Basically you're exaggerating its consistency,
    and their developers probably designed a replay system which doesn't rely on FP consistency,
    because these games had to run on completely different platforms.

    > I have yet to hear of failures specific to CPU models. Games are chaotic systems;
    > if CPUs were non-deterministic with floats, what Live for Speed does would be impossible.

    Well, it's certainly much less reliable than you think: https://stackoverflow.com/a/3144126/395609
    Games are simply more robust than lossless compression algorithms.

    > I know it for (a+b)+c vs. a+(b+c), but with only two numbers?! How is that?!

    1) https://godbolt.org/z/QHZUUm (7FD39BD2+7F800001=7FD39BD2 7F800001+7FD39BD2=7FC00001)

    2) What if "a" and "b" have different FP types (eg. float and double)?

    3) "When Floating-Point Addition Isn't Commutative" https://dl.acm.org/doi/pdf/10.1145/156301.156303
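    (For completeness, a small sketch reproducing the godbolt example from point 1; results are hardware-dependent, but on a typical x86/SSE build the two orders produce NaNs with different bit patterns, so a+b and b+a are not bitwise identical here.)

    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    static float from_bits(uint32_t u) { float f; std::memcpy(&f, &u, sizeof f); return f; }
    static uint32_t to_bits(float f)   { uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }

    int main() {
        volatile float a = from_bits(0x7FD39BD2u);   // quiet NaN carrying a payload
        volatile float b = from_bits(0x7F800001u);   // signalling NaN
        std::printf("a+b = %08X\n", (unsigned)to_bits(a + b));   // 7FD39BD2 on typical x86/SSE
        std::printf("b+a = %08X\n", (unsigned)to_bits(b + a));   // 7FC00001 on typical x86/SSE
    }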

    > IEEE754 guarantees 0.5 ULPs of accuracy, and even if the mathematical result happens to lie exactly

    1) Yes, the newer (2008+) version of IEEE-754 is stricter, but it's also relatively new -
    there's still plenty of hardware from before it.

    2) "Many real-world implementations of the architectures nearly
    support IEEE754, but with caveats like not having the full set of NaNs,
    forcing denorms to zero, errors of an ULP or two in multiplication /
    division results, having multiplication differ by an ULP or two depending
    on operand order etc. So "99% of CPUs are IEEE754 compliant" needs a
    disclaimer - the spirit is true and for the purposes of the question you
    are correct, but in general the devil is often in the detail. More like,
    99% of CPUs are 99% IEEE754 compliant."

    > It’s incredibly hard, but I still wouldn’t blame the CPU.

    I'm not "blaming CPUs". I just stumbled on this idea while googling stuff for this thread,
    and it seemed fairly realistic - with all the relatively recent out-of-order/speculative
    CPU optimizations and Spectre bugs (which also exist mostly due to speed optimizations in the CPU).

    > our computations were 100% reproducable between CPUs,

    At least transcendental functions certainly have different implementations on different CPUs.
    https://stackoverflow.com/a/13102431/395609

  16. #14
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Those interested in the deeper issues of floating-point representation and alternative number systems might want to look at unums and posits: https://en.wikipedia.org/wiki/Unum_(number_format)

  17. #15
    Member
    Join Date
    May 2017
    Location
    Germany
    Posts
    88
    Thanks
    55
    Thanked 41 Times in 25 Posts
    Quote Originally Posted by Shelwien View Post
    Well, as example: https://nfs.fandom.com/wiki/Motor_Ci.../Patch_History
    "Fixed a bug which caused replays to behave erratically."
    "Replay no longer experience strange camera bumpiness. Note, this will break existing replay files."
    Is there any comment on the root cause? Live for Speed had similar issues, but they were programming errors, like inadvertently changing the random number generator for wind. But this would have happened with integer math or bignum or anything else just as well!

    > I have yet to hear of failures specific to CPU models. Games are chaotic systems;
    > if CPUs were non-deterministic with floats, what Live for Speed does would be impossible.

    Well, it's certainly much less reliable than you think: https://stackoverflow.com/a/3144126/395609
    Again, that post does not say anything about the root cause – in fact, floating-point numbers are not even mentioned in the thread?!

    ————

    > I know it for (a+b)+c vs. a+(b+c), but with only two numbers?! How is that?!

    1) https://godbolt.org/z/QHZUUm (7FD39BD2+7F800001=7FD39BD2 7F800001+7FD39BD2=7FC00001)
    I see, you’re trying to say that one version of the code will later raise an additional floating-point exception (because it results in a signalling NaN) and the other won’t (because it results in a quiet NaN)? That’s a nasty situation indeed, but from my understanding, at least Visual C++ will mind that and won’t re-order the operands or instructions under /fp:strict:
    The compiler does not perform algebraic transformations on floating-point expressions, such as reassociation or distribution, unless the transformation is guaranteed to produce a bitwise identical result.
    ————

    2) What if "a" and "b" have different FP types (eg. float and double)?
    You mean the issue described here and here?
    Regardless of the value of FLT_EVAL_METHOD, any floating-point expression may be contracted, that is, calculated as if all intermediate results have infinite range and precision (unless #pragma STDC FP_CONTRACT is off)
    I didn’t know that, thanks! Like above, it is controlled by compiler settings.
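    (A sketch of what that contraction looks like in practice: with contraction enabled, the compiler may fuse a*b + c into a single FMA, which rounds once instead of twice and can change the last bit of the result. Not all compilers honor the pragma; Clang does, while with GCC the -ffp-contract=off option is the usual knob.)

    #pragma STDC FP_CONTRACT OFF   // forbid contracting a*b + c into fma(a, b, c)

    double mul_add(double a, double b, double c) {
        return a * b + c;   // with contraction this may compile to a single FMA
    }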

    ————

    3) "When Floating-Point Addition Isn't Commutative" https://dl.acm.org/doi/pdf/10.1145/156301.156303
    Good paper, but from my understanding their main issue is "When does rounding take place?", and you generally won’t have this issue when compiling for SSE/AVX (except when mixing different precisions, per the above) because every operation rounds to the operand’s width?

    ————

    At least transcendental functions certainly have different implementations on different CPUs.
    https://stackoverflow.com/a/13102431/395609
    Like I said: with the FPU you’re screwed anyway. SSE/AVX does not have transcendental instructions, so there is nothing to worry about as long as you stick to the proper library.

    ————

    > IEEE754 guarantees 0.5 ULPs of accuracy, and even if the mathematical result happens to lie exactly

    1) Yes, the newer (2008+) version of IEEE-754 is stricter, but it's also relatively new -
    there's still plenty of hardware from before it.

    2) "Many real-world implementations of the architectures nearly
    support IEEE754, but with caveats like not having the full set of NaNs,
    forcing denorms to zero, errors of an ULP or two in multiplication /
    division results, having multiplication differ by an ULP or two depending
    on operand order etc. So "99% of CPUs are IEEE754 compliant" needs a
    disclaimer - the spirit is true and for the purposes of the question you
    are correct, but in general the devil is often in the detail. More like,
    99% of CPUs are 99% IEEE754 compliant."
    I have little experience with non-x86 FP handling, so I assume you’re right. But at least x86 with SSE/AVX and any modern compiler shouldn’t need that disclaimer.

