
Thread: Brotli

  1. #211
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    Shared brotli is a possibility there.
    What is that? A shared dictionary or schema like Protocol Buffers uses?

  2. #212
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    FYI, there have been several Brotli releases in the last few months: 1.0.3 - 1.0.6. They may perform better, so maybe some fresh benchmarks are in order. 1.0.5 was supposed to improve q=1 compression on small files, and the release notes for 1.0.6 are:


    • fix unaligned 64-bit accesses on AArch32
    • add missing files to the sources list
    • add ASAN/MSAN unaligned read specializations
    • fix CoverityScan "unused assignment" warning
    • fix JDK 8<->9 incompatibility
    • unbreak Travis builds
    • fix auto detect of bundled mode in cmake

  3. #213
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    340
    Thanks
    195
    Thanked 58 Times in 42 Posts
    v1.0.6

    Code:
    fix unaligned 64-bit accesses on AArch32
    add missing files to the sources list
    add ASAN/MSAN unaligned read specializations
    fix CoverityScan "unused assignment" warning
    fix JDK 8<->9 incompatibility
    unbreak Travis builds
    fix auto detect of bundled mode in cmake
    Please report any problems. Compiled with ICC15. GCC compiles are much slower.
    Attached Files

  4. Thanks (3):

    hunman (30th September 2018),Sportman (7th January 2020),Stephan Busch (20th August 2019)

  5. #214
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,006
    Thanks
    97
    Thanked 401 Times in 279 Posts
    Quote Originally Posted by comp1 View Post
    Please report any problems.
    I get a corrupt input error when decompressing enwik9 with the 64-bit Brotli v1.0.6 build (compressed with -q 7 -w 22).

    A Brotli v1.0.7 Windows binary would also be very welcome.

  6. #215
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,946
    Thanks
    294
    Thanked 1,286 Times in 728 Posts
    Compiled the source from the repository.
    As usual, it needs patches for IntelC/win.
    Attached Files

  7. Thanks:

    Sportman (7th January 2020)

  8. #216
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,006
    Thanks
    97
    Thanked 401 Times in 279 Posts
    Input:
    1,000,000,000 bytes - enwik9

    Output:
    264,616,803 bytes

    brotli_gc82 -q 7 -w 22:

    1 core:
    50.330 2.005 RAMDISK
    50.821 2.063 NVMe
    50.585 2.035 SSD
    50.666 2.172 HD

    2+ cores:
    48.607 1.953 RAMDISK
    48.513 1.966 NVMe
    48.414 1.968 SSD
    48.498 2.100 HD

    brotli_ic19 -q 7 -w 22:

    1 core:
    53.445 2.235 RAMDISK
    53.386 2.208 NVMe
    53.570 2.228 SSD
    53.336 2.455 HD

    2+ cores:
    51.388 2.146 RAMDISK
    51.624 2.146 NVMe
    51.496 2.149 SSD
    51.662 2.178 HD

  9. #217
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,006
    Thanks
    97
    Thanked 401 Times in 279 Posts
    Input:
    10,000,000,000 bytes - enwik10

    Code:
    Output:
    3,770,151,519 bytes,     34.216 sec. - 26.236 sec., Brotli -0
    3,545,817,857 bytes,     42.569 sec. - 24.609 sec., Brotli -1
    3,268,318,786 bytes,     84.742 sec. - 22.775 sec., Brotli -2
    3,220,150,965 bytes,     94.157 sec. - 22.117 sec., Brotli -3
    3,095,248,795 bytes,    137.881 sec. - 20.738 sec., Brotli -4
    2,872,161,150 bytes,    251.839 sec. - 20.411 sec., Brotli -5
    2,780,438,122 bytes,    335.040 sec. - 19.575 sec., Brotli -6
    2,642,417,079 bytes,    603.066 sec. - 18.940 sec., Brotli -7
    2,573,497,114 bytes,  1,007.449 sec. - 18.956 sec., Brotli -8
    2,512,659,846 bytes,  1,734.988 sec. - 18.909 sec., Brotli -9
    2,220,027,943 bytes,  7,439.064 sec. - 22.690 sec., Brotli -10
    2,172,589,967 bytes, 14,232.804 sec. - 20.310 sec., Brotli -11
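
    For easier comparison between levels, the table above can be turned into compression ratios and throughput figures. This is only a sketch: it assumes the first time on each row is compression and the second is decompression (the post does not label them), and the names `ROWS` and `summarize` are made up for illustration.

```python
# Rows copied from the table above: (level, output bytes, first time, second time).
# Assumption: first time = compression seconds, second time = decompression seconds.
INPUT_BYTES = 10_000_000_000  # enwik10

ROWS = [
    (0, 3_770_151_519, 34.216, 26.236),
    (1, 3_545_817_857, 42.569, 24.609),
    (2, 3_268_318_786, 84.742, 22.775),
    (3, 3_220_150_965, 94.157, 22.117),
    (4, 3_095_248_795, 137.881, 20.738),
    (5, 2_872_161_150, 251.839, 20.411),
    (6, 2_780_438_122, 335.040, 19.575),
    (7, 2_642_417_079, 603.066, 18.940),
    (8, 2_573_497_114, 1007.449, 18.956),
    (9, 2_512_659_846, 1734.988, 18.909),
    (10, 2_220_027_943, 7439.064, 22.690),
    (11, 2_172_589_967, 14232.804, 20.310),
]


def summarize(level, out_bytes, comp_s, decomp_s, input_bytes=INPUT_BYTES):
    """Derive ratio and MB/s (1 MB = 10**6 bytes of input) for one row."""
    return {
        "level": level,
        "ratio": input_bytes / out_bytes,
        "comp_mb_s": input_bytes / 1e6 / comp_s,
        "decomp_mb_s": input_bytes / 1e6 / decomp_s,
    }


SUMMARY = [summarize(*row) for row in ROWS]

if __name__ == "__main__":
    for s in SUMMARY:
        print(f"-q {s['level']:>2}  ratio {s['ratio']:.3f}  "
              f"comp {s['comp_mb_s']:8.2f} MB/s  decomp {s['decomp_mb_s']:7.2f} MB/s")
```

    Throughput here is measured against the uncompressed size, so the compression and decompression columns are directly comparable.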

  10. #218
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    846
    Thanks
    242
    Thanked 309 Times in 184 Posts
    Brotli 10 and 11 should be relatively strong on this if enwik10 contains Japanese and other utf-8 heavy languages. If not, we need to take a look at the hashing...

  11. #219
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    328
    Thanks
    50
    Thanked 61 Times in 49 Posts
    Quote Originally Posted by Sportman View Post
    Input:
    10,000,000,000 bytes - enwik10

    Code:
    Output:
    3,770,151,519 bytes,     34.216 sec. - 26.236 sec., Brotli -0
    3,545,817,857 bytes,     42.569 sec. - 24.609 sec., Brotli -1
    3,268,318,786 bytes,     84.742 sec. - 22.775 sec., Brotli -2
    3,220,150,965 bytes,     94.157 sec. - 22.117 sec., Brotli -3
    3,095,248,795 bytes,    137.881 sec. - 20.738 sec., Brotli -4
    2,872,161,150 bytes,    251.839 sec. - 20.411 sec., Brotli -5
    2,780,438,122 bytes,    335.040 sec. - 19.575 sec., Brotli -6
    2,642,417,079 bytes,    603.066 sec. - 18.940 sec., Brotli -7
    2,573,497,114 bytes,  1,007.449 sec. - 18.956 sec., Brotli -8
    2,512,659,846 bytes,  1,734.988 sec. - 18.909 sec., Brotli -9
    2,220,027,943 bytes,  7,439.064 sec. - 22.690 sec., Brotli -10
    2,172,589,967 bytes, 14,232.804 sec. - 20.310 sec., Brotli -11
    Could you compare with crush v1.4?

  12. #220
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    Can you please post the full command lines used for each level and the I/O devices (HD, NVMe?)?

  13. #221
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,006
    Thanks
    97
    Thanked 401 Times in 279 Posts
    Quote Originally Posted by dnd View Post
    Can you please post the full command lines used for each level and the I/O devices (HD, NVMe?)?
    I created and ran a batch file, bro.bat (see bro.zip).
    NVMe was used.
    Attached Files

  14. #222
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    Better to use the "large_window=30" option with quality 10 and 11.

  15. #223
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,006
    Thanks
    97
    Thanked 401 Times in 279 Posts
    Quote Originally Posted by dnd View Post
    Better to use the "large_window=30" option with quality 10 and 11.
    I start with the default options first.
    If I use it, I say so, like here: https://encode.su/threads/2119-Zstan...ll=1#post62944

  16. #224
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    TurboBench uses large-window brotli with q 10 and 11 by default.
    A comparison with other compressors is not possible with the 8/16 MB window you get when you don't specify the "large_window" option.
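
    To put numbers on the window sizes being discussed: the brotli format (RFC 7932) defines the sliding window as (1 << WBITS) - 16 bytes, with WBITS from 10 to 24 in the standard format and up to 30 with the large-window extension. A small sketch of that arithmetic follows; the helper name `brotli_window_bytes` is made up for illustration and is not part of any brotli API.

```python
def brotli_window_bytes(lgwin: int) -> int:
    """Sliding window size in bytes for a given window-bits value.

    10..24 is the standard range; 25..30 needs the large-window extension.
    """
    if not 10 <= lgwin <= 30:
        raise ValueError("lgwin must be between 10 and 30")
    return (1 << lgwin) - 16


if __name__ == "__main__":
    for lgwin in (22, 23, 24, 30):
        size = brotli_window_bytes(lgwin)
        print(f"lgwin={lgwin}: {size:,} bytes (~{size / 2**20:.0f} MiB)")
```

    With -w 22 the window is about 4 MiB, 23 and 24 give the 8/16 MB figures mentioned above, and 30 gives about 1 GiB, which is why large-window results on enwik9/enwik10 are not comparable with default-window ones.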

  17. #225
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Can someone please build Windows executables for brotli 1.0.7 (the latest)? I don't have VS set up on my new computers yet.

    Brotli GitHub repo: https://github.com/google/brotli

    Thanks.

  18. #226
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,946
    Thanks
    294
    Thanked 1,286 Times in 728 Posts
    Somehow I don't think VS would help you... without patches it's kinda gcc-only.
    Attached Files

  19. Thanks:

    SolidComp (20th March 2020)

  20. #227
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Shelwien View Post
    Somehow I don't think VS would help you... without patches it's kinda gcc-only.
    I'm pretty sure I built it in the past with the help of CMake. It was Visual Studio 2017. Now I'm on 2019. Thanks for the build.

  21. #228
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Shelwien View Post
    Somehow I don't think VS would help you... without patches it's kinda gcc-only.
    Yo, the file was flagged by Microsoft as a trojan (Windows 10 blocked me from opening it), and by nine engines on VirusTotal. Here's an example of what I'm seeing:

    [Attached image: VT.JPG]

  22. #229
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,946
    Thanks
    294
    Thanked 1,286 Times in 728 Posts
    I added the gcc exe to the post above, without an archive. VirusTotal dislikes the IntelC version for some reason.

  23. #230
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts

    Appending data to a precompressed brotli stream/file

    Does anyone know if you can straightforwardly append new data to an already compressed brotli file/stream? You can do it with deflate/ZIP, as explained by Mark Adler here:

    ...what you need to do is use the local header, data, and central header from each individual zip file, write the local header and data as is sequentially to the new zip file, and save the central header and the offsets of the local headers in the new file. Then at the end of the new file save the current offset, write a new central directory using the central headers you saved, updating the offsets appropriately, and ending with a new end of central directory record with the offset of the start of the central directory.
    Brotli uses a lot more CPU and memory than gzip, so it's better suited for precompression than on-the-fly. So what I want to do is precompress HTML pages for a dynamic site. Let's say the dynamic data is just logged in user info, to do things like display a username and avatar on top of a page, much like encode.su does. So most of the HTML source could be precompressed, except for some user data dynamically added to the bottom of the file, probably in a script element. Is this easy in principle, just updating the offsets? Does it matter what level of compression is used on the precompressed file? I have the impression that brotli 11 is materially different from say 4.

  24. #231
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,946
    Thanks
    294
    Thanked 1,286 Times in 728 Posts
    > Does anyone know if you can straightforwardly append new data to an already compressed brotli file/stream?

    You can find a solution here: https://github.com/google/brotli/issues/628

    > You can do it with deflate/ZIP, as explained by Mark Adler here:

    Your quote talks about the zip archive format; it's not a single stream.
    In that case it doesn't matter which codec is actually used by the archive format.

    For raw deflate, though, it's tricky (same as for brotli): normally the header of the last block in a stream
    has a "last" flag set, so to append to a normal deflate stream we'd have
    to decode it to locate the flag position and the precise stream length in bits.

    When necessary, it's solved by using a modified encoder which generates
    a byte-aligned zero-length "stored" block at the end.

    Basically the same thing has to be done for brotli.
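
    For the deflate side, the "byte-aligned zero-length stored block" trick can be tried with Python's standard zlib module, where Z_SYNC_FLUSH emits exactly such a block without setting the "last" flag. This is a sketch for raw deflate only, not brotli; the function names and the sample HTML are made up for illustration.

```python
import zlib


def open_ended_deflate(data: bytes, level: int = 9) -> bytes:
    """Raw deflate output that is NOT terminated.

    Z_SYNC_FLUSH ends the output with an empty, non-final stored block,
    which pads to a byte boundary, so more blocks can be appended later.
    """
    co = zlib.compressobj(level, zlib.DEFLATED, -15)  # -15 = raw deflate, no header
    return co.compress(data) + co.flush(zlib.Z_SYNC_FLUSH)


def closing_deflate(data: bytes, level: int = 1) -> bytes:
    """Independent raw deflate stream whose final block has the "last" flag set."""
    co = zlib.compressobj(level, zlib.DEFLATED, -15)
    return co.compress(data) + co.flush()  # default flush mode is Z_FINISH


# Precompress the static part once, at a high level.
static_part = b"<html><body><p>Mostly static page content.</p>\n" * 200
precompressed = open_ended_deflate(static_part)

# Per request: compress only the small dynamic tail and concatenate.
dynamic_part = b"<script>var user = {name: 'example'};</script></body></html>"
stream = precompressed + closing_deflate(dynamic_part)

restored = zlib.decompress(stream, wbits=-15)
```

    The appended part is compressed without access to the earlier data, so it cannot reference the precompressed page; that costs a little density but keeps the two parts independent. Raw deflate is used here because the zlib and gzip wrappers add a trailing checksum over the whole payload, which would have to be recomputed or combined separately.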

  25. #232
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    846
    Thanks
    242
    Thanked 309 Times in 184 Posts
    Quote Originally Posted by SolidComp View Post
    Brotli uses a lot more CPU and memory than gzip, so it's better suited for precompression than on-the-fly.
    While this is repeated a lot it is actually not correct. Brotli uses less CPU and roughly the same amount of memory as gzip. Brotli 0 uses about 3x less CPU than gzip's fastest mode. Brotli 0 (and 1) are more streaming at encoding than gzip so you get the data out faster with less buffering. Brotli can be used with 1 kB backward reference window with a single entropy code, so decoding memory needs are also minimal.

    For every gzip quality setting there is a faster brotli setting that compresses to a higher density.

    Brotli's slowest mode 11 uses about 2x less CPU than pigz quality 11 or zopfli (at default settings), which are the highest density implementations for gzip.

  26. Thanks:

    JamesB (28th April 2020)

  27. #233
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    While this is repeated a lot it is actually not correct. Brotli uses less CPU and roughly the same amount of memory as gzip. Brotli 0 uses about 3x less CPU than gzip's fastest mode. Brotli 0 (and 1) are more streaming at encoding than gzip so you get the data out faster with less buffering. Brotli can be used with 1 kB backward reference window with a single entropy code, so decoding memory needs are also minimal.

    For every gzip quality setting there is a faster brotli setting that compresses to a higher density.

    Brotli's slowest mode 11 uses about 2x less CPU than pigz quality 11 or zopfli (at default settings), which are the highest density implementations for gzip.
    This seems spinny and invalid. You're comparing to zopfli, which is notoriously slow and CPU intensive, and pigz, which is multithreaded and has intensive modes up past 9 (zlib gzip stops at 9, has no 11).

    Apples to apples would be something like brotli 4 to zlib gzip 6 or something. It's zlib's gzip that servers actually use, not zopfli. (And the lightest gzip is SLZ.) How much CPU and RAM does brotli 4 use?

  28. #234
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    By the way, what's happening with brotli's releases? The last release was in 2018: https://github.com/google/brotli/releases

  29. #235
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts

    Brotli 1.0.7 built in Visual Studio

    Attached is an optimized brotli 1.0.7 built in Visual Studio 2019, with the help of CMake.

    It's optimized in that I enabled Whole Program Optimization (aka Link-Time Optimization in gcc) and Function-Level Linking. I don't know much about the latter. I made the CPU target floor SSE2 so that it will run on any computer of the past 15 years or so.

    It's a directory containing the .exe and a bunch of .lib and .dll files. I'm not sure how to build a static .exe with Visual Studio or if that messes up the optimizations.
    Attached Files

  30. #236
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    846
    Thanks
    242
    Thanked 309 Times in 184 Posts
    Quote Originally Posted by SolidComp View Post
    Visual Studio 2019
    In the past I have seen claims that gcc/clang produce significantly faster code for brotli than Visual Studio. I wonder if this is still the case and if someone with access to both compilers could try it. It could be that Visual Studio is under- or over-inlining, or there might be another simple explanation.

  31. #237
    Member
    Join Date
    May 2019
    Location
    Japan
    Posts
    27
    Thanks
    4
    Thanked 8 Times in 4 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    Brotli uses less CPU and roughly the same amount of memory as gzip.
    I wonder how you measure memory consumption to make this claim? In my benchmark gzip uses much less memory than brotli:

    http://kirr.dyndns.org/sequence-comp...ow+scatterplot

    Quote Originally Posted by Jyrki Alakuijala View Post
    Brotli 0 uses about 3x less CPU than gzip's fastest mode. Brotli 0 (and 1) are more streaming at encoding than gzip so you get the data out faster with less buffering. Brotli can be used with 1 kB backward reference window with a single entropy code, so decoding memory needs are also minimal.

    For every gzip quality setting there is a faster brotli setting that compresses to a higher density.

    Brotli's slowest mode 11 uses about 2x less CPU than pigz quality 11 or zopfli (at default settings), which are the highest density implementations for gzip.
    Comparison with zopfli (and pigz -11) is irrelevant since zopfli is too far from being practical. However, I can confirm that brotli can usually offer a faster and stronger setting than any gzip level (at least on my data).

    E.g., comparison on human genome:

    http://kirr.dyndns.org/sequence-comp...ow+scatterplot

  32. Thanks (2):

    Mike (22nd May 2020),SolidComp (22nd May 2020)

  33. #238
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    In the past I have seen claims that gcc/clang produce significantly faster code for brotli than Visual Studio. I wonder if this is still the case and if someone with access to both compilers could try it. It could be that Visual Studio is under- or over-inlining, or there might be another simple explanation.
    There's not enough careful research on these different compilers and their performance with compressors. It's a shame. The compiler that is most intriguing to me is Intel's C/C++ compiler, called Parallel Studio. What smattering of data exists suggests that it's better than gcc, Visual Studio, clang, etc. Why don't your teams at Google use it? Google can certainly afford it. It seems to be especially good at vectorization, and it has great OpenMP support to help parallelize applications.

    The Intel compiler is actually free for open source contributors. Linux version only.

    If I were better at Python or something I'd rig up an automated benchmarking framework.
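
    A very small starting point for such a framework might look like the sketch below. It only times in-process codecs from Python's standard library (zlib, bz2, lzma) as stand-ins; comparing builds of brotli from different compilers would mean swapping in calls to the respective executables, which is not shown. All names (`bench`, `CODECS`, `run`, `SAMPLE`) are made up for illustration.

```python
import bz2
import lzma
import time
import zlib


def bench(name, compress, decompress, data, repeats=3):
    """Time one codec; keep the best of `repeats` runs and verify the round trip."""
    best_c = best_d = float("inf")
    packed = b""
    for _ in range(repeats):
        t0 = time.perf_counter()
        packed = compress(data)
        best_c = min(best_c, time.perf_counter() - t0)

        t0 = time.perf_counter()
        unpacked = decompress(packed)
        best_d = min(best_d, time.perf_counter() - t0)

        if unpacked != data:
            raise RuntimeError(f"{name}: round trip failed")
    return {"name": name, "size": len(packed), "comp_s": best_c, "decomp_s": best_d}


# Each entry maps a label to (compress, decompress) callables.
CODECS = {
    "zlib-1": (lambda d: zlib.compress(d, 1), zlib.decompress),
    "zlib-6": (lambda d: zlib.compress(d, 6), zlib.decompress),
    "zlib-9": (lambda d: zlib.compress(d, 9), zlib.decompress),
    "bz2-9": (lambda d: bz2.compress(d, 9), bz2.decompress),
    "lzma-6": (lambda d: lzma.compress(d, preset=6), lzma.decompress),
}


def run(data, repeats=3):
    """Benchmark every codec in CODECS on the same input."""
    return [bench(name, c, d, data, repeats) for name, (c, d) in CODECS.items()]


# Small synthetic input so the sketch runs quickly; real tests need real corpora.
SAMPLE = b"".join(b"line %d: the quick brown fox jumps over the lazy dog\n" % i
                  for i in range(5000))

if __name__ == "__main__":
    for r in run(SAMPLE):
        print(f"{r['name']:>7}  {r['size']:>8,} bytes  "
              f"{r['comp_s'] * 1000:8.2f} ms  {r['decomp_s'] * 1000:8.2f} ms")
```

    Taking the best of several runs reduces noise from caching and scheduling, though synthetic data like this says little about behaviour on real files.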

  34. #239
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Kirr, what's the w30 bit at the end of the brotli versions, like brotli-11w30?

  35. #240
    Member
    Join Date
    May 2019
    Location
    Japan
    Posts
    27
    Thanks
    4
    Thanked 8 Times in 4 Posts
    Quote Originally Posted by SolidComp View Post
    Kirr, what's the w30 bit at the end of the brotli versions, like brotli-11w30?
    It's their secret undocumented setting:

    Code:
    brotli -q 11 --large_window=30 -c >{OUT}
    On this page I specify each and every command line used in the benchmark: http://kirr.dyndns.org/sequence-comp...?page=Commands
