Thread: Assessing compression library reliability

  1. #1
    Member
    Join Date
    May 2015
    Location
    Israel
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Assessing compression library reliability

    At my workplace we need to use a compression algorithm for a project. The requirements are the same as for any such project: compression performance vs. CPU utilization, a suitable API, and so on.
    But now I was asked to assess the reliability of available libraries.

    One option is looking at commits - snappy for example has no commits in the past 2 years, so I can consider it stable and reliable. LZ4 on the other hand has recent bug fixes, and to some people that means it is still not reliable enough.
    Another option is looking at how many projects use each library, and which ones. But the data is sketchy, and many projects support or use multiple libraries.

    I'm looking for more ideas about how to assess libraries' reliability as a factor in selecting the best fit. If I could "prove" that they are all reliable enough for use in a product, I can move on to the more technical aspects in making the decision.

  2. #2
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    412
    Thanks
    38
    Thanked 64 Times in 38 Posts
    @dbbd: welcome to the encode forum

    I think you should first decide

    what you have

    -- low or high CPU resources, or GPU resources

    and what you want

    -- fast compression or very good compression?
    -- decompression that must be very fast (= decompressing often?) or compression for archival purposes (= decompressing rarely)

    and then look further


    Out of the box, I suggest you look at and test:

    http://www.zlib.net/

    - stable, patent-free and "very often used in projects"

    http://www.oberhumer.com/opensource/lzo/

    - stable, patent-free (GPL v2+) - long tested - small code - small CPU requirements
    - very portable, written in ANSI C
    - fast compression and *extremely* fast decompression.

    http://www.7-zip.org/

    - stable, patent-free - 7z - very good compression - LZMA2 - SDK - higher CPU requirements

    http://libbsc.com/

    - stable
    - but needs further testing and modification to compile - not easy ..
    - very innovative - very, very fast, especially if you have GPU resources ...
    - I think compression modes st5 and st6 in particular are interesting
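
    A quick way to run this kind of "look and test" comparison is sketched below. It is only an illustration: it uses the zlib and lzma modules bundled with Python (lzma wraps liblzma, not the 7-Zip SDK, and LZO and libbsc have no standard-library binding), the helper names measure and compare are made up for this example, and the sample data is synthetic. Real measurements should use your own data.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the ratio."""
    start = time.perf_counter()
    packed = compress(data)
    mid = time.perf_counter()
    restored = decompress(packed)
    end = time.perf_counter()
    if restored != data:
        raise ValueError(f"{name}: round trip did not reproduce the input")
    return {
        "codec": name,
        "ratio": len(data) / len(packed),
        "compress_s": mid - start,
        "decompress_s": end - mid,
    }


def compare(data):
    """Run the same buffer through a few codec settings (illustrative choices)."""
    codecs = [
        ("zlib level 1", lambda d: zlib.compress(d, 1), zlib.decompress),
        ("zlib level 9", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("lzma (xz) default preset", lzma.compress, lzma.decompress),
    ]
    return [measure(name, c, d, data) for name, c, d in codecs]


if __name__ == "__main__":
    # Synthetic, very repetitive sample; replace with representative data.
    sample = b"Assessing compression library reliability\n" * 5000
    for row in compare(sample):
        print(f"{row['codec']:<26} ratio {row['ratio']:8.1f}  "
              f"compress {row['compress_s']:.4f}s  "
              f"decompress {row['decompress_s']:.4f}s")
```

    Timings from a single run like this are noisy, so repeating each measurement and comparing on several kinds of input is advisable before drawing conclusions.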

    best regards

  3. Thanks:

    Bulat Ziganshin (21st May 2015)

  4. #3
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    889
    Thanks
    483
    Thanked 279 Times in 119 Posts
    xxxxxx for example has no commits in the past 2 years, so I can consider it stable and reliable
    Having no commits looks like a dangerous proxy to judge reliability.
    Many open source projects are not updated for extended periods simply because they are dead.
    It doesn't mean they are reliable.

  5. Thanks:

    JamesB (28th May 2015)

  6. #4
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts
    I agree with Cyan that there are many dead projects (especially open source projects). But being dead isn't necessarily a clear sign of being reliable or unreliable. On the one hand, a dead project is stable; on the other hand, it will become incompatible with new versions of operating systems. But I think that the availability of the source code (or detailed documentation of how it works) and the terms of use also affect reliability, in the sense that you can make slight changes yourself if the original author gives up the work. It could also be ported to a different or new architecture in the future.
    Last edited by just a worm; 21st May 2015 at 22:25.

  7. #5
    Member
    Join Date
    Jul 2013
    Location
    United States
    Posts
    194
    Thanks
    44
    Thanked 140 Times in 69 Posts
    Quote Originally Posted by dbbd View Post
    But now I was asked to assess the reliability of available libraries.

    One option is looking at commits - snappy for example has no commits in the past 2 years, so I can consider it stable and reliable.
    Like Cyan mentioned, this doesn't necessarily mean it is reliable. Based on a message posted to the Snappy mailing list (which I feel a lot of people missed), the reason Snappy hasn't seen any recent commits seems to be that it is hard for Google to maintain the open-source version of Snappy.

    Quote Originally Posted by dbbd View Post
    I'm looking for more ideas about how to assess libraries' reliability as a factor in selecting the best fit. If I could "prove" that they are all reliable enough for use in a product, I can move on to the more technical aspects in making the decision.
    I think the only way you're going to prove that something is reliable enough for you is to test it. I've caught lots of bugs with the unit tests for Squash (as well as a few with static analyzers and fuzzers) and by running the Squash Benchmark, and I actually feel fairly comfortable with all the plugins which aren't currently disabled (zling and doboz are the disabled ones), though I'll feel much better after I get around to fuzzing them all. IMO the best you can do is just throw everything you can at the library you're interested in and see if it works. Run it under Valgrind, ASan, TSan, UBSan, etc. Fuzz it, feed it random data, tiny buffers, huge buffers, truncated data, etc.
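
    A minimal sketch of that "throw everything at it" idea, using Python's standard zlib module as a stand-in for whichever library is being evaluated. This is not how Squash tests its plugins; the function name stress_test, the buffer sizes, and the round count are assumptions picked for illustration.

```python
import random
import zlib


def stress_test(seed=0, rounds=60):
    """Round-trip assorted buffers, then feed truncated streams to the decoder."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    sizes = [0, 1, 2, 255, 256, 4096, 65536]  # empty and tiny up to fairly large
    stats = {
        "roundtrip_failures": 0,
        "truncations_rejected": 0,
        "truncations_accepted": 0,
        "unexpected_errors": 0,
    }
    for i in range(rounds):
        size = sizes[i % len(sizes)]
        if i % 2 == 0:
            # Random bytes: essentially incompressible input.
            data = bytes(rng.getrandbits(8) for _ in range(size))
        else:
            # Highly repetitive input of the same length.
            data = (b"abc" * size)[:size]

        packed = zlib.compress(data)
        if zlib.decompress(packed) != data:
            stats["roundtrip_failures"] += 1

        # Cut the stream at a random point; the result is always a strict prefix.
        cut = rng.randrange(len(packed))
        try:
            zlib.decompress(packed[:cut])
            stats["truncations_accepted"] += 1  # decoder did not notice the cut
        except zlib.error:
            stats["truncations_rejected"] += 1  # clean, documented failure
        except Exception:
            stats["unexpected_errors"] += 1     # anything else deserves a look
    return stats


if __name__ == "__main__":
    print(stress_test())
```

    Any round-trip failure, accepted truncation, or unexpected error would be worth investigating. For a native library, running the equivalent loop against a sanitizer-instrumented build adds memory-error detection on top of these functional checks.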

  8. #6
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    506
    Thanks
    187
    Thanked 177 Times in 120 Posts
    You can try American Fuzzy Lop too (http://lcamtuf.coredump.cx/afl/) to mutate input data and slowly trawl through as many code paths as possible, looking for bugs.

    You may need to disable CRC and similar checks though to help it identify problems elsewhere.
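
    To illustrate why the checksum matters, here is a rough sketch that flips single bits in a compressed stream and records how the decoder reacts. It is a blind bit flipper, not AFL and not coverage-guided. It uses Python's zlib module, where a negative wbits value selects raw deflate with no header or Adler-32 trailer, as a stand-in for "checks disabled". The helper names deflate, classify, and fuzz are invented for this example.

```python
import random
import zlib


def deflate(data, wbits):
    """wbits=15: zlib wrapper with Adler-32 trailer; wbits=-15: raw deflate, no checksum."""
    comp = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


def classify(stream, wbits, original):
    """Decode a (possibly mutated) stream and report what the decoder did with it."""
    try:
        out = zlib.decompress(stream, wbits)
    except zlib.error:
        return "rejected"
    return "same" if out == original else "changed"


def fuzz(original, wbits, trials=200, seed=0):
    """Flip one random bit per trial and tally the decoder's reaction."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    stream = deflate(original, wbits)
    counts = {"rejected": 0, "same": 0, "changed": 0}
    for _ in range(trials):
        mutated = bytearray(stream)
        pos = rng.randrange(len(mutated))
        mutated[pos] ^= 1 << rng.randrange(8)
        counts[classify(bytes(mutated), wbits, original)] += 1
    return counts


if __name__ == "__main__":
    payload = b"reliability " * 50
    print("with checksum:", fuzz(payload, 15))
    print("raw deflate:  ", fuzz(payload, -15))
```

    I would expect the checksummed stream to land almost entirely in "rejected", since nearly every mutation ends in the same integrity failure, while the raw stream lets some mutations through as "changed". That uniform rejection is what can keep a fuzzer from telling interesting inputs apart from boring ones.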
