Thread: Dropbox DivANS (nibble)

  1. #1
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    737
    Thanks
    230
    Thanked 233 Times in 143 Posts

  2. Thanks (4):

    Gonzalo (19th June 2018), hexagone (19th June 2018), SolidComp (21st June 2018), Stephan Busch (20th June 2018)

  3. #2
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    752
    Thanks
    216
    Thanked 283 Times in 165 Posts
    Note that there is a 5x decoding speed drop compared to brotli due to more thorough context modeling.

  4. #3
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    159
    Thanks
    24
    Thanked 68 Times in 39 Posts
    Neat, but that decode speed is very off-putting, so I think I'll stick with good old brotli since it's nice and speedy.
    From the looks of it, it reads the raw variable dump of any input codec and then just applies a stronger context model to its output. However, this won't equate to large gains unless they can control the parser of the input and tune it for their models, and I don't see anything like that in their source code. From experience, the parser has a larger effect than a more complex encoder for LZ77. The difference between greedy parsing and optimal parsing on text typically equates to a 10% gain in compression, whereas a stronger entropy coder usually makes up less than 1% of overall compression. Adding better literal models helps, but when the entropy coder changes, the parser stays blind to the possible shorter paths available with the new fancy context models. Just my two cents.
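
A rough sketch of the parsing point above, not taken from DivANS or brotli: a toy LZ77 parser with a fixed, invented bit cost per literal and per match. The constants `LITERAL_BITS`, `MATCH_BITS`, `MIN_MATCH` and the sample string are assumptions made for illustration. Greedy parsing grabs the longest match at each position, while the dynamic-programming parse picks the cheapest path under the same cost model.

```python
# Toy LZ77 parse comparison. The bit costs are invented for illustration;
# real codecs price literals and matches with adaptive entropy models.
LITERAL_BITS = 9
MATCH_BITS = 24
MIN_MATCH = 3


def longest_match(data: bytes, pos: int) -> int:
    """Length of the longest match for data[pos:] starting at any earlier offset."""
    best = 0
    for start in range(pos):
        length = 0
        # Overlapping matches are allowed, as in LZ77.
        while pos + length < len(data) and data[start + length] == data[pos + length]:
            length += 1
        best = max(best, length)
    return best


def greedy_cost(data: bytes) -> int:
    """Always take the longest match available at the current position."""
    pos, bits = 0, 0
    while pos < len(data):
        length = longest_match(data, pos)
        if length >= MIN_MATCH:
            bits += MATCH_BITS
            pos += length
        else:
            bits += LITERAL_BITS
            pos += 1
    return bits


def optimal_cost(data: bytes) -> int:
    """Cheapest parse under the same cost model, by dynamic programming from the end."""
    n = len(data)
    cost = [0] * (n + 1)  # cost[i] = fewest bits needed to encode data[i:]
    for pos in range(n - 1, -1, -1):
        best = LITERAL_BITS + cost[pos + 1]
        # Any prefix of the longest match is also a valid match.
        for length in range(MIN_MATCH, longest_match(data, pos) + 1):
            best = min(best, MATCH_BITS + cost[pos + length])
        cost[pos] = best
    return cost[0]


SAMPLE = b"abcX-bcdefghijk-abcdefghijk"
print("greedy bits: ", greedy_cost(SAMPLE))
print("optimal bits:", optimal_cost(SAMPLE))
```

On the sample, greedy takes the short "abc" match and then needs a second match for the rest, whereas the optimal parse emits "a" as a literal and covers the remainder with one long match. If the cost model changes (as it would with a new entropy coder), the cheapest path can change too, which is why a parser tuned for one coder may miss savings under another.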

  5. #4
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    486
    Thanks
    169
    Thanked 166 Times in 114 Posts
    I find that chart confusing (maybe deliberately so). Basically the Brotli file is 1.86% larger than the DivANS one. If size is everything then sure that matters, but pragmatically it's not enough to justify a huge speed penalty. We have the size difference being shown as unusually significant due to the way it is shown relative to something else, and the speed being downplayed by switching to a log scale.
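
To make the framing effect concrete, here is a small arithmetic sketch. Only the 1.86% gap comes from the post; the baseline size and the brotli size are hypothetical numbers picked for illustration, and the actual chart may use a different reference.

```python
# Hypothetical sizes, chosen only to illustrate the effect. The 1.86% gap is
# the one quoted in the post; the baseline and brotli figures are invented.
baseline = 100.0           # assumed reference codec output (arbitrary units)
brotli = 93.0              # assumed brotli output
divans = brotli / 1.0186   # brotli is 1.86% larger than DivANS

direct_gap = brotli / divans - 1.0        # gap measured on the files themselves
brotli_saving = 1.0 - brotli / baseline   # saving relative to the baseline
divans_saving = 1.0 - divans / baseline
relative_gain = divans_saving / brotli_saving - 1.0

print(f"direct size gap: {direct_gap:.2%}")
print(f"brotli saving:   {brotli_saving:.2%}")
print(f"DivANS saving:   {divans_saving:.2%}")
print(f"extra saving:    {relative_gain:.1%}")
```

Under these assumed numbers, a file-size gap of under 2% becomes roughly a quarter more "saving" once both codecs are plotted against the baseline, which is the kind of amplification the post describes.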

    Still, it's great to see new tools.

  6. #5
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    239
    Thanks
    95
    Thanked 47 Times in 31 Posts
    Aren't Kraken and other Oodle compressors significantly better than this in both ratio and decode speed? Why don't they just buy the Oodle codecs instead of paying exorbitant Bay Area developer salaries to achieve tiny incremental improvements to zlib and brotli?

    Relatedly, I've been confused about LZTurbo lately. Is it the best? It sure looks like the best. It looks like it might be better than the Oodle codecs, but I don't think I've ever seen them tested against each other. I'm vaguely aware that there's some kind of drama with LZTurbo and Bulat claiming some of the code is his or something, but I'm hazy on the details. In any case, LZTurbo looks to be way ahead of just about everything else, and yet we keep talking about compression as though it doesn't exist. If LZTurbo is the best, then that's what new codecs should be compared to.

  7. #6
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    752
    Thanks
    216
    Thanked 283 Times in 165 Posts
    All these codecs you mention (possibly not zlib) are on the Pareto-optimal curve for either or both of density/compression speed and density/decompression speed for some platform and some use case.
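
For readers unfamiliar with the term, a minimal sketch of what being on the Pareto curve means for size versus decode speed. The codec names and numbers in `SAMPLE` are placeholders, not measurements of any real codec.

```python
from typing import List, NamedTuple


class Codec(NamedTuple):
    name: str
    size: float   # compressed size, smaller is better
    speed: float  # decode speed, larger is better


def pareto_front(codecs: List[Codec]) -> List[Codec]:
    """Keep codecs that no other codec beats on both size and speed."""
    front = []
    for c in codecs:
        dominated = any(
            o.size <= c.size and o.speed >= c.speed
            and (o.size < c.size or o.speed > c.speed)
            for o in codecs
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.size)


# Placeholder data: C is both larger and slower than B, so it drops out.
SAMPLE = [
    Codec("A", size=25.0, speed=50.0),
    Codec("B", size=27.0, speed=400.0),
    Codec("C", size=30.0, speed=300.0),
    Codec("D", size=32.0, speed=900.0),
]
print([c.name for c in pareto_front(SAMPLE)])
```

Every codec left on the front is the best choice for some trade-off between size and speed, which is why several codecs can all be "optimal" at once.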

  8. #7
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    239
    Thanks
    95
    Thanked 47 Times in 31 Posts
    Originally posted by Jyrki Alakuijala:
    All these codecs you mention (possibly not zlib) are on the Pareto-optimal curve for either or both of density/compression speed and density/decompression speed for some platform and some use case.
    I'm puzzled. These are the codecs I tend to see winning benchmarks. Are you saying Brotli is better? I've not seen Brotli win any benchmarks against these – can you point me to some? Is this about the window size issue and the validity of certain benchmarks?

    And RAD isn't standing still. Charles just posted this: http://cbloomrants.blogspot.com/2018...de-speeds.html

    Do you have any benchmarks or executables for Shared Brotli yet?

  9. #8
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    752
    Thanks
    216
    Thanked 283 Times in 165 Posts
    Originally posted by SolidComp:
    Is this about the window size issue and the validity of certain benchmarks?
    Many benchmarks (like Matt Mahoney's large text compression benchmark) compare brotli at a 22- or 24-bit window size against another codec at, say, a 29-bit window size. This gives brotli a 10-20% density disadvantage (or more if you prep the data by inserting duplicates far apart).
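
The window-size effect can be reproduced in miniature with Python's standard `zlib` module. This is a stand-in, not brotli: the windows here are 512 bytes and 32 KiB rather than 22 to 29 bits, and the test data is synthetic, built so that its only redundancy is a repeat 20,000 bytes back.

```python
import random
import zlib


def deflate_size(data: bytes, window_bits: int) -> int:
    """Compressed size using zlib with a 2**window_bits byte window."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, window_bits)
    return len(compressor.compress(data) + compressor.flush())


rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(20_000))  # incompressible on its own
data = block + block  # the only redundancy is a repeat 20,000 bytes back

small_window = deflate_size(data, 9)   # 512-byte window: cannot reach the repeat
large_window = deflate_size(data, 15)  # 32 KiB window: sees the whole first copy
print(small_window, large_window)
```

The same compressor at the same level produces a much larger output with the small window, purely because the duplicate lies outside its reach. A benchmark that mixes window sizes measures this effect along with the codecs themselves.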

    Compressionette tests were performed using the same window size. Summary at https://encode.su/threads/2947-large...ll=1#post56651

    Sportman's quick test at https://encode.su/threads/2947-large...ll=1#post56655 shows brotli clearly on the decoding speed/density Pareto curve. zstd is the first faster-to-decode algorithm there, but it creates 5% more bytes.

    LZ Turbo benchmarks seem to have a higher quality than many others: https://github.com/powturbo/TurboBench and https://sites.google.com/site/powtur...eb-compression

  10. #9
    Member
    Join Date
    May 2017
    Location
    Sealand
    Posts
    15
    Thanks
    7
    Thanked 2 Times in 2 Posts
    Test results of divans on enwik8 and enwik9

    enwik8: 27,071,566 bytes

    enwik9: 235,781,511 bytes
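
For context, these sizes can be turned into ratios. enwik8 and enwik9 are the first 10^8 and 10^9 bytes of the same Wikipedia dump; the helper names below are just for this sketch.

```python
ENWIK8_BYTES = 10**8  # enwik8: first 10^8 bytes of the Wikipedia dump
ENWIK9_BYTES = 10**9  # enwik9: first 10^9 bytes


def ratio(compressed: int, original: int) -> float:
    """Compressed size as a fraction of the original size."""
    return compressed / original


def bits_per_byte(compressed: int, original: int) -> float:
    """Average number of output bits spent per input byte."""
    return 8.0 * compressed / original


results = {
    "enwik8": (27_071_566, ENWIK8_BYTES),
    "enwik9": (235_781_511, ENWIK9_BYTES),
}
for name, (compressed, original) in results.items():
    print(f"{name}: {ratio(compressed, original):.2%} of original, "
          f"{bits_per_byte(compressed, original):.3f} bits per byte")
```

That works out to about 27.07% for enwik8 and 23.58% for enwik9.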

  11. #10
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    236
    Thanked 90 Times in 70 Posts
    Originally posted by Chirantan:
    Test results of divans on enwik8 and enwik9

    enwik8: 27,071,566 bytes

    enwik9: 235,781,511 bytes
    Do you happen to have a working binary you can share? I guess you had to do some editing, because the official site says:

    You can see an expanded version of this example, as well as try DivANS on your own files (up to 2MB) interactively at https://dropbox.github.io/divans.
    Thanks in advance, if that is the case.

  12. #11
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,717
    Thanks
    271
    Thanked 1,185 Times in 656 Posts

  13. #12
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    236
    Thanked 90 Times in 70 Posts
    Originally posted by Shelwien:
    Occam's razor... And I still don't get it
    Thanks Shelwien
