Page 12 of 16
Results 331 to 360 of 453

Thread: Zstandard

  1. #331
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,026
    Thanks
    103
    Thanked 410 Times in 285 Posts
    Zstd v1.4.0 64-bits:
    https://github.com/facebook/zstd/releases

    Input:
    enwik9

    Output:
    357,515,962 bytes, 2.361 sec. - 1.130 sec., -1
    329,419,520 bytes, 3.353 sec. - 1.235 sec., -2
    314,012,348 bytes, 4.333 sec. - 1.287 sec., -3
    308,269,618 bytes, 4.584 sec. - 1.346 sec., -4
    302,075,641 bytes, 7.075 sec. - 1.353 sec., -5
    295,590,745 bytes, 9.556 sec. - 1.338 sec., -6
    285,397,982 bytes, 13.247 sec. - 1.285 sec., -7
    281,180,106 bytes, 16.799 sec. - 1.241 sec., -8
    278,757,548 bytes, 24.206 sec. - 1.242 sec., -9
    273,782,728 bytes, 29.656 sec. - 1.246 sec., -10
    271,392,063 bytes, 38.660 sec. - 1.256 sec., -11
    269,321,220 bytes, 59.222 sec. - 1.236 sec., -12
    266,022,487 bytes, 75.572 sec. - 1.212 sec., -13
    261,574,115 bytes, 101.186 sec. - 1.225 sec., -14
    258,869,397 bytes, 138.761 sec. - 1.232 sec., -15
    250,212,437 bytes, 156.968 sec. - 1.222 sec., -16
    242,902,736 bytes, 232.295 sec. - 1.205 sec., -17
    239,765,452 bytes, 273.279 sec. - 1.245 sec., -18
    235,698,881 bytes, 368.606 sec. - 1.277 sec., -19
    226,024,466 bytes, 467.710 sec. - 1.428 sec., -20
    220,222,797 bytes, 529.841 sec. - 1.483 sec., -21
    215,032,608 bytes, 569.795 sec. - 1.507 sec., -22
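For anyone wanting to reproduce a sweep like this, the shape of the loop is simple (a sketch only; the input here is a small generated stand-in, not enwik9, and timing is left out):

```shell
set -e
head -c 1048576 /dev/urandom > input.bin   # small stand-in for enwik9
for level in 1 3 5; do
    zstd -q -f -"$level" input.bin -o out."$level".zst
    printf '%s bytes, -%s\n' "$(wc -c < out."$level".zst)" "$level"
done
```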

  2. Thanks (4):

    avitar (28th April 2019),Bulat Ziganshin (28th April 2019),Cyan (29th April 2019),Jango (30th April 2019)

  3. #332
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    9
    Thanked 4 Times in 4 Posts
    How do I use a dictionary with zstd?
    I tried:
    Code:
    zstd  BJ_all_Corr.csv -D dict.1 -o dict.zstd
    zstd: cannot use BJ_all_Corr.csv as an input file and dictionary
    How do I tell zstd which file is the trained dictionary (dict.1 here)?

  4. #333
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,306 Times in 742 Posts
    zstd dictionary file is not raw, you have to build it first:
    Code:
    Dictionary builder :
    --train ## : create a dictionary from a training set of files
    --train-cover[=k=#,d=#,steps=#,split=#] : use the cover algorithm with optional args
    --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#] : use the fast cover algorithm with optional args
    --train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
     -o file : `file` is dictionary name (default: dictionary)
    --maxdict=# : limit dictionary to specified size (default: 112640)
    --dictID=# : force dictionary ID to specified value (default: random)
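Once a dictionary has been built with --train, both compression and decompression take it via -D. A minimal round-trip sketch (filenames are illustrative; a file without the dictionary magic is loaded as a raw-content dictionary, so the tiny file below stands in for a real trained dict.1):

```shell
set -e
printf 'id,value\n1,foo\n2,bar\n' > sample.csv       # stand-in input
printf 'id,value,foo,bar\n' > dict.raw               # stand-in for a trained dictionary
zstd -q -f -D dict.raw sample.csv -o sample.csv.zst  # compress with dictionary
zstd -q -f -d -D dict.raw sample.csv.zst -o roundtrip.csv
cmp sample.csv roundtrip.csv && echo OK
```

Note that the decompressor needs the same dictionary: decoding a dictionary-compressed frame without it fails.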

  5. #334
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    9
    Thanked 4 Times in 4 Posts
    Yes, this is the dict.1 file
    Code:
    zstd --train .\train\*  -o dict.1
    Trying 5 different sets of parameters
    k=1998
    d=8
    f=20
    steps=4
    split=75
    accel=1
    Save dictionary of size 112640 into file dict.1

  6. #335
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    889
    Thanks
    483
    Thanked 279 Times in 119 Posts
    Hi @Jethro

    This command line you present should have worked.

    zstd FILE -D dict -o dest
    is a usual construction that is known and well tested.

    I can't explain from this snippet why it would not work for you ....

  7. #336
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    9
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Cyan View Post
    Hi @Jethro

    This command line you present should have worked.

    zstd FILE -D dict -o dest
    is a usual construction that is known and well tested.

    I can't explain from this snippet why it would not work for you ....
    Thanks Cyan
    *** zstd command line interface 64-bits v1.4.2, by Yann Collet ***
    Win 10

  8. #337
    Member
    Join Date
    Jan 2017
    Location
    Germany
    Posts
    63
    Thanks
    31
    Thanked 14 Times in 11 Posts
    I have got a question:
    Is Zstandard capable of compressing very large files, e.g. files of 40 Gbytes each, or is there a file size limit?

  9. #338
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,306 Times in 742 Posts
    No filesize limit normally. The zstd API (zstd.h) doesn't even work with files, but rather with streams of unknown length.

    A custom file format which uses zstd for compression can easily have such limits though.
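A quick way to see the streaming behaviour: feed zstd through a pipe, where the input length is unknown until EOF (a sketch; the 1 MB file here is a stand-in for a 40 GB one):

```shell
set -e
head -c 1048576 /dev/urandom > big.bin   # stand-in for an arbitrarily large input
zstd -q -c < big.bin > big.bin.zst       # size unknown to zstd until the pipe closes
zstd -q -d -c < big.bin.zst > big.out
cmp big.bin big.out && echo OK
```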

  10. Thanks:

    WinnieW (12th October 2019)

  11. #339
    Member
    Join Date
    Jan 2017
    Location
    Germany
    Posts
    63
    Thanks
    31
    Thanked 14 Times in 11 Posts
    I can confirm there is no problem. I compressed a 38 GByte file using the official 64-bit Windows command-line binary and verified file integrity using SHA-1 checksums. The original file and the decompressed file were bit-identical.
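That verification procedure can be scripted in a few lines (a sketch; `sha1sum` from coreutils is assumed, and the file is a small stand-in for the 38 GB one). The zstd CLI also embeds an XXH64 content checksum by default, so `zstd -t` alone already catches corruption:

```shell
set -e
head -c 1048576 /dev/urandom > orig.bin   # stand-in for the 38 GB file
zstd -q -f orig.bin -o orig.zst
zstd -q -t orig.zst                       # integrity check via embedded checksum
zstd -q -f -d orig.zst -o restored.bin
a=$(sha1sum < orig.bin); b=$(sha1sum < restored.bin)
[ "$a" = "$b" ] && echo "bit identical"
```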

  12. Thanks:

    Cyan (13th October 2019)

  13. #340
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    9
    Thanked 4 Times in 4 Posts
    Hi @Jethro

    This command line you present should have worked.

    zstd FILE -D dict -o dest
    is a usual construction that is known and well tested.

    I can't explain from this snippet why it would not work for you ....
    https://github.com/facebook/zstd/issues/1817

  14. Thanks:

    Shelwien (13th October 2019)

  15. #341
    Member
    Join Date
    Aug 2015
    Location
    The Earth
    Posts
    12
    Thanks
    3
    Thanked 21 Times in 7 Posts

    Post Zstandard v1.4.4

    https://github.com/facebook/zstd/releases/tag/v1.4.4
    This release includes some major performance improvements and new CLI features, which make it a recommended upgrade.

    Faster Decompression Speed
    Decompression speed has been substantially improved, thanks to @terrelln. Exact mileage obviously varies depending on files and scenarios, but the general expectation is a bump of about +10%. The benefit is considered applicable to all scenarios, and will be perceptible for most usages.

    Faster Compression Speed when Re-Using Contexts
    In server workloads (characterized by very high compression volume of relatively small inputs), the allocation and initialization of zstd's internal data structures can become a significant part of the cost of compression. For this reason, zstd has long had an optimization (which we recommended for large-scale users, perhaps with something like this): when you provide an already-used ZSTD_CCtx to a compression operation, zstd tries to re-use the existing data structures, if possible, rather than re-allocate and re-initialize them.
    Historically, this optimization could avoid re-allocation most of the time, but required an exact match of internal parameters to avoid re-initialization. In this release, @felixhandte removed the dependency on matching parameters, allowing the full context re-use optimization to be applied to effectively all compressions. Practical workloads on small data should expect a ~3% speed-up.
    In addition to improving average performance, this change also has some nice side-effects on the extremes of performance.

    • On the fast end, it is now easier to get optimal performance from zstd. In particular, it is no longer necessary to do careful tracking and matching of contexts to compressions based on detailed parameters (as discussed for example in #1796). Instead, straightforwardly reusing contexts is now optimal.
    • Second, this change ameliorates some rare, degenerate scenarios (e.g., high volume streaming compression of small inputs with varying, high compression levels), in which it was possible for the allocation and initialization work to vastly overshadow the actual compression work. These cases are up to 40x faster, and now perform in-line with similar happy cases.


    Dictionaries and Large Inputs

    In theory, using a dictionary should always be beneficial. However, due to some long-standing implementation limitations, it can actually be detrimental. Case in point: by default, dictionaries are prepared to compress small data (where they are most useful). When this prepared dictionary is used to compress large data, there is a mismatch between the prepared parameters (targeting small data) and the ideal parameters (that would target large data). This can cause dictionaries to counter-intuitively result in a lower compression ratio when compressing large inputs.
    Starting with v1.4.4, using a dictionary with a very large input will no longer be detrimental. Thanks to a patch from @senhuang42, whenever the library notices that input is sufficiently large (relative to dictionary size), the dictionary is re-processed, using the optimal parameters for large data, resulting in improved compression ratio.
    The capability is also exposed, and can be manually triggered using ZSTD_dictForceLoad.

    New commands

    The zstd CLI extends its capabilities, providing new advanced commands, thanks to great contributions:

    • zstd-generated files (compressed or decompressed) can now be automatically stored into a different directory than the source one, using the --output-dir-flat=DIR command, provided by @senhuang42.
    • It’s possible to inform zstd about the size of data coming from stdin. @nmagerko proposed 2 new commands, allowing users to provide the exact stream size (--stream-size=#) or an approximate one (--size-hint=#). Both only make sense when compressing a data stream from a pipe (such as stdin), since for a real file, zstd obtains the exact source size from the file system. Providing a source size allows zstd to better adapt internal compression parameters to the input, resulting in better performance and compression ratio. Additionally, providing the precise size makes it possible to embed this information in the compressed frame header, which also allows decoder optimizations.
    • In situations where the same directory content gets regularly compressed, with the intention to only compress new files not yet compressed, it’s necessary to filter the file list to exclude already-compressed files. This process is simplified with the command --exclude-compressed, provided by @shashank0791. As the name implies, it simply excludes all compressed files from the list to process.

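The two stdin-size commands above can be sketched like this (sizes and filenames are illustrative; both flags require zstd ≥ v1.4.4):

```shell
set -e
head -c 65536 /dev/urandom > data.bin
# Exact size known: it must match the real stream length, and is embedded in the frame header
zstd -q -f --stream-size=65536 -o exact.zst < data.bin
# Only an estimate known: --size-hint merely guides parameter selection
zstd -q -f --size-hint=60000 -o hinted.zst < data.bin
zstd -q -f -d exact.zst -o data.out
cmp data.bin data.out && echo OK
```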

    Single-File Decoder with Web Assembly

    Let’s complete the picture with an impressive contribution from @cwoffenden. libzstd has long offered the capability to build only the decoder, in order to generate smaller binaries that can be more easily embedded into memory-constrained devices and applications.
    @cwoffenden built on this capability and offers a script creating a single-file decoder, as an amalgamated variant of the reference Zstandard decoder. The package is completed with a nice build script, which compiles the one-file decoder into WASM code for embedding into web applications, and even tests it.
    As a capability example, check out the awesome WebGL demo provided by @cwoffenden in the /contrib/single_file_decoder/examples directory!

    Full List

    • perf: Improved decompression speed, by > 10%, by @terrelln
    • perf: Better compression speed when re-using a context, by @felixhandte
    • perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42
    • perf: zstd reference encoder can generate RLE blocks, by @bimbashrestha
    • perf: minor generic speed optimization, by @davidbolvansky
    • api: new ability to extract sequences from the parser for analysis, by @bimbashrestha
    • api: fixed decoding of magic-less frames, by @terrelln
    • api: fixed ZSTD_initCStream_advanced() performance with fast modes, reported by @QrczakMK
    • cli: Named pipes support, by @bimbashrestha
    • cli: short tar's extension support, by @stokito
    • cli: command --output-dir-flat=DIR generates target files into the requested directory, by @senhuang42
    • cli: commands --stream-size=# and --size-hint=#, by @nmagerko
    • cli: command --exclude-compressed, by @shashank0791
    • cli: faster -t test mode
    • cli: improved some error messages, by @vangyzen
    • cli: fix rare deadlock condition within dictionary builder, by @terrelln
    • build: single-file decoder with emscripten compilation script, by @cwoffenden
    • build: fixed zlibWrapper compilation on Visual Studio, reported by @bluenlive
    • build: fixed deprecation warning for certain gcc version, reported by @jasonma163
    • build: fix compilation on old gcc versions, by @cemeyer
    • build: improved installation directories for cmake script, by Dmitri Shubin
    • pack: modified pkgconfig, for better integration into openwrt, requested by @neheb
    • misc: Improved documentation : ZSTD_CLEVEL, DYNAMIC_BMI2, ZSTD_CDict, function deprecation, zstd format
    • misc: fixed educational decoder : accept larger literals section, and removed UNALIGNED() macro



  16. Thanks (3):

    Cyan (26th November 2019),Mike (8th November 2019),schnaader (8th November 2019)

  17. #342
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    Popular HN thread regarding recent Arch Linux zstd package compression: https://news.ycombinator.com/item?id=21958585

    Also leading to nice benchmarks: http://pages.di.unipi.it/farruggia/dcb/

  18. #343
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by Jarek View Post
    Popular HN thread regarding recent Arch Linux zstd package compression: https://news.ycombinator.com/item?id=21958585

    Also leading to nice benchmarks: http://pages.di.unipi.it/farruggia/dcb/
    That benchmark compares 16 MB window brotli against 128+ MB window zstd. You cannot learn much from it.

    If you rerun it with large-window brotli, brotli will be consistently 5-10 % more dense than zstd.

  19. #344
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    Indeed this benchmark does not look fair, but it is not you who should complain: it tests 14 brotli configurations and only one for zstd.

    Whose fault is it that someone testing 14 configurations of your program does not get to the ones you would like him to use?

  20. #345
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by Jarek View Post
    Indeed this benchmark does not look fair, but it is not you who should complain: it tests 14 brotli configurations and only one for zstd.

    Whose fault is it that someone testing 14 configurations of your program does not get to the ones you would like him to use?
    There is a fair variation of this benchmark. If you choose chunked data, then there is no difference in window setting.

    Further, you need to enable the tables and look at the results from them instead of the plots; in some plots the best brotli result doesn't show, as it is literally off the charts in density.

  21. #346
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    I'd say the unfairness is zstd's fault. They bundle decoding memory use with encoding effort in a very confusing and impractical way.

  22. #347
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    803
    Thanks
    244
    Thanked 255 Times in 159 Posts
    I don't think the author of this benchmark is on this forum - you should complain directly to all those benchmark authors showing the inferiority of Brotli.

  23. #348
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by Jarek View Post
    I don't think the author of this benchmark is on this forum - you should complain directly to all those benchmark authors showing the inferiority of Brotli.
    I just don't understand why people would want to see algorithm 1 with window of 128 MB and algorithm 2 with window of 16 MB.

    I don't understand why people would want to celebrate faulty benchmarking when perfectly implemented benchmarking has been done, too.

    The LzTurbo benchmarking at https://github.com/powturbo/TurboBench is highly disciplined, but possibly uses a non-optimal compiler for brotli (I didn't see it so slow against others in my compilations). Unfortunately they later removed zstd from their web data benchmark. Also, they highlight pd3d.tar, which is the worst-case data for brotli of all that I have seen. Nonetheless, they don't fall into the usual traps (like comparing different LZ window sizes to each other).

    The Georgi 'Sanmayce' benchmark at https://github.com/google/brotli/issues/642 is the most complete large-data benchmark that I know of.

  24. #349
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    The TurboBench Compression Benchmark already uses brotli with the max window size (1 GB) in the default mode.
    No need to specify anything.
    TurboBench can be compiled with gcc, clang and Intel on Linux, MinGW on Windows, and on macOS.
    All libs are compiled with the option "-O3".
    I can't see what you mean by a non-optimal compiler.

    On the web compression benchmark, I'm benchmarking only libraries compatible with HTTP compression (gzip, br).

    I suggest putting a link to TurboBench and the web compression benchmark on the brotli GitHub site.
    Users can do their own benchmarks with their own data, with large-window brotli as the default.
    Most users don't want to study or specify the options offered by the compressors.

    Georgi 'Sanmayce' is also using TurboBench for brotli.

  25. #350
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by dnd View Post
    TurboBench Compression Benchmark is using brotli with max. window size (1GB) already in the default mode.
    No need to specify anything.
    Turbobench can be compiled with gcc,clang,intel on linux, mingw on windows and on MacOs.
    All libs are compiled with the option "-O3".
    I can't see what you mean with non-optimal compiler.
    If I grab brotli and zstd from GitHub and compress enwik8 with -q 7 -w 23 for brotli and -13 for zstd, I get the following results:

    Compression time:
    Brotli: 8.3 s (12 MB/s)
    ZStd: 11.2 s (8.9 MB/s)

    Size:
    Brotli: 30205556 bytes
    ZStd: 30365678 bytes

    Decompression time:
    Brotli: 326 ms (307 MB/s)
    ZStd: 234 ms (427 MB/s)

    I see 39 % faster decompression from zstd, but 34 % faster compression and 0.5 % better density for brotli at these compression settings.

    TurboBench sees ~200 % faster decompression from zstd. There is a big difference between 39 % and 200 %.

    My compiler for this test was gcc 8.3.0, but pretty much always I see 30-50 % faster decompression for zstd; otherwise brotli outperforms zstd, i.e., in compression speed and density. I never saw 200 % faster decompression for zstd, no matter which data I tried. This is why I presumed a different compiler (one that struggles with brotli code) or different compiler settings...

    Quote Originally Posted by dnd View Post
    On the web compression benchmark, I'm benchmarking only libraries compatible with http compression (gzip,br).
    zstd is running for the web, too. They have a published RFC https://datatracker.ietf.org/doc/rfc8478/ and an application/zstd media type.

    Publishing the data for zstd, too, would allow discussions around this RFC to be grounded in real data.

    Quote Originally Posted by dnd View Post
    I suggest putting a link to turbobench and web compression on the github brotli site.
    Please file an issue on the brotli GitHub? This sounds like a good thing to do.
    Last edited by Jyrki Alakuijala; 6th January 2020 at 21:08. Reason: Had -q 8 in the text, but was actually using -q 7

  26. Thanks:

    Shelwien (6th January 2020)

  27. #351
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    Not sure if your numbers are with or without I/O.

    I've made the same benchmark as yours with the latest GitHub versions (6 Jan 2020).
    There is no difference in speed between the builds from the makefiles provided with brotli and TurboBench's own makefile.

    See: updated benchmark


    As stated before, TurboBench uses large-window brotli as the default for all levels.
    You have specified only w23 (8 MB) as the window size.
    The compression ratio and speed difference can be significant with other files when you use a dictionary/window size larger than 8 MB.
    Last edited by dnd; 7th January 2020 at 11:47.

  28. #352
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Sorry, looks like I had a typo in the text: I was using -q 7, not -q 8, for brotli. Would you kindly rerun, too?

    Still rather interesting that not doing I/O changes the 39 % difference to a 100+ % difference. I have 192 GB of RAM in this machine; the I/O is definitely not waiting for the disk. Perhaps the additional memory or cache pressure from the memory writes related to I/O reduces the decoding speed difference.

  29. #353
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by dnd View Post
    The compression ratio and speed diff. can be significant with other files when you use a larger dic/window size as 8MB.
    I chose an 8 MB backward window as I'm expecting zstd to use that window at -13 (I could be wrong about this assumption).

  30. #354
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,306 Times in 742 Posts
    Afaiu zstd level 13 has a 4 MB window:
    Code:
    {   /* "default" - for any srcSize > 256 KB */
        /* W,  C,  H,  S,  L, TL, strat */
        { 22, 21, 22,  5,  5, 32, ZSTD_btlazy2 },  /* level 13 */
    [...]
        { 23, 23, 22,  5,  4, 64, ZSTD_btopt   },  /* level 17 */
    }
    but then it still seems 2x faster?
    Code:
          C Size  ratio%     C MB/s     D MB/s   Name            File
        29464303    29.5       5.66     319.40   brotli 8        enwik8
        30205556    30.2      10.85     463.14   brotli 7w23     enwik8
        29582960    29.6       6.44     450.92   brotli 8w23     enwik8
        30331609    30.3       7.51     995.91   zstd 13         enwik8
        27702787    27.7       2.82     919.83   zstd 17         enwik8
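One way to test the window question directly is to override the per-level defaults with zstd's advanced parameter syntax; a sketch (the input is a small generated stand-in for enwik8):

```shell
set -e
head -c 1048576 /dev/urandom > testfile               # stand-in for enwik8
zstd -q -f -13 testfile -o default13.zst              # level 13, default 4 MB window (wlog=22)
zstd -q -f -13 --zstd=wlog=23 testfile -o wlog23.zst  # same level, forced 8 MB window
zstd -q -f -d wlog23.zst -o roundtrip
cmp testfile roundtrip && echo OK
```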

  31. Thanks:

    dnd (6th January 2020)

  32. #355
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Ok, now command line results with brotli:7:d22 vs zstd -13

    Compression time:
    Brotli: 7.6 s (13.2 MB/s) +48 % faster
    ZStd: 11.2 s (8.9 MB/s)

    Size:
    Brotli: 30306199 bytes
    ZStd: 30365678 bytes (+0.2 % more bloat)

    Decompression time:
    Brotli: 316 ms (316 MB/s)
    ZStd: 234 ms (427 MB/s) +35 % faster

  33. Thanks:

    dnd (6th January 2020)

  34. #356
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    565
    Thanks
    67
    Thanked 199 Times in 147 Posts
    Benchmark updated with more levels.

  35. #357
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by dnd View Post
    Benchmark updated with more levels.
    Do you think it would make sense to profile zstd and brotli with and without I/O? With I/O there is no significant performance difference, and without I/O quite a dramatic decompression speed difference.

    Perhaps there is just some functional difference that makes zstd a lot faster when actual I/O is disabled? Perhaps brotli does something relatively silly when there is no I/O?

  36. #358
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,977
    Thanks
    296
    Thanked 1,306 Times in 742 Posts
    I think this is what it does for decompression:
    Code:
    case P_BROTLI: {
        BrotliDecoderState* s = BrotliDecoderCreateInstance(NULL, NULL, NULL); if(!s) return -1;
        BrotliDecoderSetParameter(s, BROTLI_DECODER_PARAM_LARGE_WINDOW, 1u);
        size_t total_out, available_in=inlen, available_out=outlen; uint8_t *next_in=in, *next_out=out;
        BrotliDecoderResult rc = BrotliDecoderDecompressStream(s, &available_in, (const uint8_t **)&next_in, &available_out, (uint8_t **)&next_out, &total_out);
        BrotliDecoderDestroyInstance(s);
        return rc ? total_out : 0;
    }

  37. #359
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,026
    Thanks
    103
    Thanked 410 Times in 285 Posts
    Input:
    1,000,000,000 bytes - enwik9

    Zstd 1.4.4 -13

    Output:
    265,980,982 bytes

    1 core:
    79.231 1.293, RAMDISK
    79.054 1.381, NVMe
    81.408 1.638, SSD
    86.046 1.367, HD

    265,614,307 bytes, 26.6 %, 13.72 MB/s, 1443.41 MB/s, turbobench (Jan 6 2020)

    2+ core:
    77.443, 1.277, RAMDISK
    77.738, 1.129, NVMe
    77.319, 1.313, SSD
    77.303, 1.612, HD

    265,614,307 bytes, 26.6 %, 14.16 MB/s, 1478.71 MB/s, turbobench (Jan 6 2020)

    ----------------------------------------------------------------------------

    Brotli 1.07 -q 7 -w 22

    Output:
    264,616,803 bytes

    1 core:
    50.330 2.005 RAMDISK
    50.821 2.063 NVMe
    50.585 2.035 SSD
    50.666 2.172 HD

    262,485,325 bytes, 26.2 %, 16.33 MB/s, 441.34 MB/s, turbobench (Jan 6 2020)

    2+ cores:
    48.607 1.953 RAMDISK
    48.513 1.966 NVMe
    48.414 1.968 SSD
    48.498 2.100 HD

    262,485,325 bytes, 26.2 %, 16.91 MB/s, 457.44 MB/s, turbobench (Jan 6 2020)

    All tests Windows 10, all 64-bit.
    Last edited by Sportman; 7th January 2020 at 18:26.

  38. #360
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    876
    Thanks
    242
    Thanked 325 Times in 198 Posts
    Quote Originally Posted by Shelwien View Post
    I think this is what it does for decompression:
    Code:
        case P_BROTLI: { 
            BrotliDecoderState* s = BrotliDecoderCreateInstance(NULL, NULL, NULL); if(!s) return -1;
        BrotliDecoderSetParameter(s, BROTLI_DECODER_PARAM_LARGE_WINDOW, 1u); 
        size_t total_out, available_in=inlen, available_out=outlen; uint8_t *next_in=in, *next_out=out;
        BrotliDecoderResult rc = BrotliDecoderDecompressStream(s, &available_in, (const uint8_t **)&next_in, &available_out, (uint8_t **)&next_out, &total_out); 
            BrotliDecoderDestroyInstance(s);
            return rc?total_out:0; 
          }
    I'm interested in what the actual profile delta is with I/O on or off. Zstd gets 2-3x faster when I/O is off, but brotli doesn't really change speed -- and there was no big difference when I/O was on.

    Perhaps without I/O, brotli decoding calls a function or two for every decoded byte. Perhaps the compiler understands that the results are not used for anything and is able to remove some of the last steps of the decompression in the zstd case (when results are not used). Computing something from the data, like a fast checksum or an xor of all bytes, might further reduce the possibility of that happening.
    Last edited by Jyrki Alakuijala; 7th January 2020 at 12:44.
