Activity Stream

  • JamesB's Avatar
    Today, 17:51
    I think libdeflate is the fastest tool out there right now, unless you limit yourself to light-weight "level 1" style compression, in which case maybe libslz wins out. We integrated libdeflate support into Samtools for (de)compression of sequencing alignment data in the BAM format. I suspect this is the cause of libdeflate becoming an official Ubuntu package, as Samtools/htslib have it as a dependency.
    I recently retested several deflate implementations on enwik8:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m5.003s    0m0.517s    36548933
    intel        0m3.057s    0m0.503s    36951028
    cloudflare   0m2.492s    0m0.443s    36511793
    jtkukunas    0m2.956s    0m0.357s    36950998
    ng           0m2.022s    0m0.377s    36881293
    zstd (gz-6)  0m4.674s    0m0.468s    36548933
    libdeflate   0m1.769s    0m0.229s    36648336
    Note the file sizes fluctuate a bit. That's within the difference between gzip -5 vs -6, so arguably you'd include that in the time difference too.
    I also tried them at level 1 compression:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m1.851s    0m0.546s    42298786
    intel        0m0.866s    0m0.524s    56046821
    cloudflare   0m1.163s    0m0.470s    40867185
    jtkukunas    0m1.329s    0m0.392s    40867185
    ng           0m0.913s    0m0.397s    56045984
    zstd (gz)    0m1.764s    0m0.475s    42298786
    libdeflate   0m1.024s    0m0.235s    39597396
    Level 1 is curious: you can see very clearly how the different versions have traded off encoder speed vs size efficiency, with cloudflare and jtkukunas apparently using the same algorithm, and intel/ng likewise. Libdeflate is no longer the fastest here, but it's not far off and produces the smallest output, so it's in a sweet spot.
    And for fun, level 9:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m6.113s    0m0.516s    36475804
    intel        0m5.153s    0m0.516s    36475794
    cloudflare   0m2.787s    0m0.442s    36470203
    jtkukunas    0m5.034s    0m0.365s    36475794
    ng           0m2.872s    0m0.371s    36470203
    zstd (gz)    0m5.702s    0m0.467s    36475804
    libdeflate   0m9.124s    0m0.237s    35197159
    All remarkably similar sizes, except libdeflate, which took longer but squashed it considerably more. Libdeflate actually goes up to -12, but it's not a good tradeoff on this file:
    libdeflate   0m14.660s   0m0.236s    35100586
    Edit: I tested 7z max too, but it was comparable to libdeflate max and much slower.
    4 replies | 240 view(s)
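    (For anyone who wants to reproduce JamesB's libdeflate numbers programmatically: a minimal C sketch of driving libdeflate's public API, assuming only its header; level and buffer handling here are illustrative, not the exact Samtools/htslib integration.)
        #include <stdlib.h>
        #include <libdeflate.h>

        /* Compress `in` (in_len bytes) into a gzip member with libdeflate.
           Returns a malloc'd buffer and sets *out_len, or NULL on failure. */
        static void *gzip_with_libdeflate(const void *in, size_t in_len,
                                          int level, size_t *out_len)
        {
            struct libdeflate_compressor *c = libdeflate_alloc_compressor(level);
            if (!c)
                return NULL;

            /* Worst-case output size for this input length. */
            size_t bound = libdeflate_gzip_compress_bound(c, in_len);
            void *out = malloc(bound);
            if (!out) {
                libdeflate_free_compressor(c);
                return NULL;
            }

            *out_len = libdeflate_gzip_compress(c, in, in_len, out, bound);
            libdeflate_free_compressor(c);
            if (*out_len == 0) {        /* 0 means the output did not fit */
                free(out);
                return NULL;
            }
            return out;
        }
    libdeflate accepts levels 1 through 12, which matches the -12 data point quoted above.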
  • LucaBiondi's Avatar
    Today, 17:10
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
    Hi Shelwien,
    About testing parameters:
    for ppmd_12_256_1:
      level 8:  order 12, memory 210 MB
      level 9:  order 16, memory 420 MB
      level 10: order 16, memory 840 MB
      level 11: order 16, memory 1680 MB
      level 12: order 16, memory 1680 MB
      level 13: order 16, memory 1680 MB
      level 14: order 16, memory 1680 MB
      level 15: order 16, memory 1680 MB
    for ppmd_6_64_2:
      order 6, memory 64 MB (for each level)
    Do you know if there is a memory limitation allocating more than 1680 MB or 2000 MB?
    Thank you, Luca
    916 replies | 313474 view(s)
  • cssignet's Avatar
    Today, 15:24
    Oops, sorry then (I removed the wrong statement); I have to admit that I did not check the results here :). I just did a few trials and met some situations where a PNG would be smaller stored as lossless with JXL (instead of lossy), and thought this was one of them. My observations were about the web usage context only, and how 16 bits/sample PNGs are rendered in web browsers anyway.
    12 replies | 339 view(s)
  • skal's Avatar
    Today, 15:03
    And yet, sequential codecs are more efficient than parallel ones: tile-based compression has sync points and contention that make the codec wait for threads to finish. Processing several images separately in parallel doesn't have this inefficiency (provided memory and I/O are not the bottleneck). Actually, sequential codecs are at an advantage in some quite important cases:
    * image bursts on a phone camera (the sensor takes a sequence of photos in short bursts)
    * web page rendering (which usually contains a lot of images; think of the YouTube landing page)
    * displaying photo albums (/thumbnails)
    * back-end processing of a lot of photos in parallel (cloudinary?)
    Actually, I'd say parallel codecs are mostly useful for the Photoshop case (where you're working on one photo only) and screen sharing (/slide decks).
    Side note: JPEG can be made parallelizable using restart markers. The fact that no one is using them is somewhat telling.
    In any case, I would have multiplied JPEG's MP/s by 4x in your table to get fair numbers.
    12 replies | 339 view(s)
  • Jon Sneyers's Avatar
    Today, 14:08
    The thing is, a bitstream needs to be suitable for parallel encode/decode. That is not always the case. Comparing using just 1 thread gives an unfair advantage to inherently sequential codecs. Typical machines have more than 4 cores nowadays. Even in phones, 8 is common. The tendency is towards more cores and not much faster cores. The ability to do parallel encode/decode is important.
    12 replies | 339 view(s)
  • skal's Avatar
    Today, 13:34
    -sharp_yuv is not the default because it's slower: the defaults are adapted to the general common use case, and the image you picked as source is far from the common case the defaults are tuned for (all the more so since these images are better compressed losslessly!).
    Just because you have 4 cores doesn't mean you want to use them all at once, especially if you have several images to compress in parallel (which is often the case). For making a point with a fair comparison, it would have been less noisy to force 1 thread for all codecs. As presented, I find the text quite misleading.
    12 replies | 339 view(s)
  • Jon Sneyers's Avatar
    Today, 13:16
    Good point, yes, better results are probably possible for all codecs with custom encoder options. I used default options for all. Numbers are for 4 threads, as is mentioned in the blog post. On a single core, libjpeg-turbo will be faster. Using more than four cores, jxl will be significantly faster still. It's hard to find a CPU with fewer than 4 cores these days.
    12 replies | 339 view(s)
  • Jon Sneyers's Avatar
    Today, 13:07
    Correct. The article shows only that crop, but the sizes are for the whole image. Also, lossless WebP wouldn't be completely lossless, since this is a 16-bit PNG (quantizing to 8-bit introduces very minor color banding).
    12 replies | 339 view(s)
  • Jon Sneyers's Avatar
    Today, 12:55
    Sorry, yes, drop the f_jpg,q_97 to get the actual original PNG.
    12 replies | 339 view(s)
  • Kirr's Avatar
    Today, 10:45
    Yeah, it is more accurate to say the two are a tool and a library implementing the DEFLATE algorithm. In my benchmark, by "gzip" I refer to the software tool, not to the "gzip" file format.
    zlib has "zpipe.c" in its "examples" directory; this may be what you mean. I guess there is no point testing it, but perhaps I should benchmark it to confirm this.
    It seems 7-Zip is still Windows-exclusive. However, there is a more portable "p7zip" - I will think about adding it to the benchmark.
    4 replies | 240 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Today, 09:34
    @darek could you test paq8sk19 -x15 -w -e1,english.dic on enwik9 please ? thank you
    83 replies | 7689 view(s)
  • cssignet's Avatar
    Today, 09:13
    The host (https://i.slow.pics/) did some kind of post-processing on the PNG (dropping the iCCP chunk and recompressing the image data less efficiently). Those files are not what I uploaded (see the edited link in my first post).
    12 replies | 339 view(s)
  • hunman's Avatar
    Today, 08:01
    hunman replied to a thread MCM + LZP in Data Compression
    Maybe you can integrate it into Bulat's FreeARC...
    53 replies | 35264 view(s)
  • Li weiqin's Avatar
    Today, 05:46
    Li weiqin replied to a thread MCM + LZP in Data Compression
    I've used this wonderful program for a year and wondered who made it, and then I found this thread - thank you. But it's hard to use for ordinary people like me, since it can only run from the command line and compresses one file per operation. If somebody could design a GUI or make a graphical front-end, that would be great.
    53 replies | 35264 view(s)
  • SolidComp's Avatar
    Today, 03:25
    Your lossless reduction darkened the image though. Look at them side by side.
    12 replies | 339 view(s)
  • cssignet's Avatar
    Today, 02:19
    I guess the original PNG would be this: https://res.cloudinary.com/cloudinary-marketing/image/upload/Web_Assets/blog/high_fidelity.png
    Some trials with close filesize (webp = no meta, png = meta):
    cwebp -q 91 high_fidelity.png -o q91.webp (52.81 KB) -> q91.png
    cwebp -q 90 -sharp_yuv high_fidelity.png -o q90-sharp.webp (52.06 KB) -> q90-sharp.png
    It would be unrelated to the point of the article itself, but still, since web delivery is mentioned, a few points on the samples/results from an end-user point of view:
    - about PNG itself, the encoder used here produces very over-bloated data for a web context, making the initial filesize non-representative of the format (the original PNG is 2542.12 KB, but the expected rendering for the web could be losslessly encoded to 227.08 KB with all chunks). As a side note, this PNG encoder also wrote non-standard keys for zTXt/tEXt chunks and non-standard chunks (caNv).
    Btw, instead of mathematically lossless only, do you plan to provide a "web lossless" mode somehow? I did not try, but feeding the (mathematically) lossless encoder with a 16 bits/sample PNG would probably create an over-bloated file for web usage.
    12 replies | 339 view(s)
  • SvenBent's Avatar
    Today, 01:34
    Thank you for the testing. I ran into some of the same issues with ECT; it appears ECT uses a much higher number of blocks than pngout. I reported this issue to caveman in his huffmix thread: https://encode.su/threads/1313-Huffmix-a-PNGOUT-r-catalyst?p=65017&viewfull=1#post65017
    Personally, since DeflOpt never increases size, I do not believe it has the biggest effect with huffmix, but I can mix ECT + DeflOpt /b with ECT + defluff + DeflOpt /b, as defluff sometimes increases size.
    I wonder what the huffmix success rate is for ECT -9 with pngout /f6 /ks /kp /force on the ECT file.
    469 replies | 125681 view(s)
  • Shelwien's Avatar
    Yesterday, 23:30
    https://www.phoronix.com/scan.php?page=news_item&px=Torvalds-Threadripper Yes, but he just wanted more threads.
    1 replies | 68 view(s)
  • skal's Avatar
    Yesterday, 23:18
    Also:
    * You forgot to use the '-sharp_yuv' option for the webp example (53kb). Otherwise, it would have given you the quite sharper version (and note that this webp was encoded from the jpeg-q97, not the original PNG).
    * In the "Computational Complexity" section, I'm very surprised that JPEG XL is faster than libjpeg-turbo. Did you forget to mention multi-thread usage?
    12 replies | 339 view(s)
  • skal's Avatar
    Yesterday, 21:35
    Jon, your "Original PNG image (2.6 MB)" is actually a JPEG (https://res.cloudinary.com/cloudinary-marketing/image/upload/f_jpg,q_97/Web_Assets/blog/high_fidelity.png) when downloaded. Did you mean to add 'f_jpg,q_97' in the URL?
    12 replies | 339 view(s)
  • SolidComp's Avatar
    Yesterday, 21:19
    SolidComp replied to a thread Brotli in Data Compression
    Wow it shrunk jQuery down to 10 KB! That's impressive. The dictionary is 110 KB, but that's a one-time hit. There were error messages on dictionary creation though. I don't really understand them:
    255 replies | 82067 view(s)
  • Jon Sneyers's Avatar
    Yesterday, 20:30
    Hi everyone! I wrote a blog post about the current state of JPEG XL and how it compares to other state-of-the-art image codecs. https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    12 replies | 339 view(s)
  • SolidComp's Avatar
    Yesterday, 19:31
    "gzip" as such isn't a command line interface to the zlib library. It's just a format, one of three that zlib supports (the other two are raw DEFLATE and a "zlib" format, also DEFLATE-based). GNU gzip is just a specific app that produces gzip files (and maybe others?). I think zlib has a program that you can easily build. It might be called minizip. Someone please correct me if I'm wrong. The 7-Zip gzipper is unrelated to the .7z or LZMA formats. I'm speaking of 7-Zip the app. It can produce .7z, .xz, gzip (.gz), .zip, .bz2, and perhaps more compression formats. Pavlov wrote his own gzipper from scratch, apparently, and it's massively better than any other gzipper, like GNU gzip or libdeflate. I assume it's better than zlib's gzipper as well. I don't understand how he did it. So if you want to compare the state of the art to gzip, it would probably make sense to use the best gzipper. His gzip files are 17% smaller than libdeflate's on text...
    4 replies | 240 view(s)
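    (To make the "three formats" point above concrete: in zlib itself the container is chosen through deflateInit2's windowBits parameter. A minimal single-shot C sketch, assuming zlib.h and trimming error handling; names are illustrative.)
        #include <string.h>
        #include <zlib.h>

        /* windowBits selects the container around the DEFLATE data:
             15       -> zlib wrapper
            -15       -> raw DEFLATE (no header/trailer)
             15 + 16  -> gzip wrapper (.gz)                          */
        static int deflate_to(int window_bits,
                              const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_cap, size_t *out_len)
        {
            z_stream s;
            memset(&s, 0, sizeof(s));
            if (deflateInit2(&s, Z_BEST_COMPRESSION, Z_DEFLATED,
                             window_bits, 8, Z_DEFAULT_STRATEGY) != Z_OK)
                return -1;

            s.next_in   = (unsigned char *)in;
            s.avail_in  = (uInt)in_len;
            s.next_out  = out;
            s.avail_out = (uInt)out_cap;

            int ret = deflate(&s, Z_FINISH);   /* single shot over the whole input */
            *out_len = out_cap - s.avail_out;
            deflateEnd(&s);
            return ret == Z_STREAM_END ? 0 : -1;
        }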
  • Scope's Avatar
    Yesterday, 19:15
    Scope replied to a thread JPEG XL vs. AVIF in Data Compression
    How JPEG XL Compares to Other Image Codecs: https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    15 replies | 687 view(s)
  • smjohn1's Avatar
    Yesterday, 19:12
    OK, that makes sense too. So reducing LZ4_DISTANCE_MAX doesn't necessarily increase compression speed. That might be a sweet spot in terms of compression speed.
    4 replies | 189 view(s)
  • Cyan's Avatar
    Yesterday, 18:57
    In fast mode, finding more matches corresponds to effectively skipping more data and searching less, so it tends to be faster indeed.
    4 replies | 189 view(s)
  • smjohn1's Avatar
    Yesterday, 17:18
    You are right. I checked the code again, and the memory use level was indeed 18 instead of 14. So that was the reason, which makes sense. On the other hand, a smaller LZ4_DISTANCE_MAX results in a (slight) decrease in compression speed. Is that because literal processing (memory copy) is slower than match processing?
    4 replies | 189 view(s)
  • lz77's Avatar
    Yesterday, 10:47
    https://habr.com/ru/news/t/503658/ Sorry, in Russian.
    1 replies | 68 view(s)
  • Krishty's Avatar
    Yesterday, 10:23
    While huffmix works great with pngout /r, I had little success using it on combinations of ECT/DeflOpt/defluff. Details here: https://encode.su/threads/3186-Papa%E2%80%99s-Optimizer?p=65106#post65106 I should check whether there is a way to use ECT similar to pngout /r, i.e. whether block splits are stable with different parameters …
    469 replies | 125681 view(s)
  • Krishty's Avatar
    Yesterday, 09:54
    I did some experiments with huffmix according to this post by SvenBent: https://encode.su/threads/2274-ECT-an-file-optimizer-with-fast-zopfli-like-deflate-compression?p=64959&viewfull=1#post64959 (There is no public build because I haven't gotten a response from caveman so far regarding the huffmix license.)
    I tested a few thousand PNGs from my hard drive. Previous optimization used ECT + defluff + DeflOpt; now it uses huffmix on all intermediate results. Some observations:
    - Without pngout, huffmix has only three streams to choose from: ECT output, ECT + DeflOpt, ECT + defluff + DeflOpt. So there is not much gain to expect. Actual gains were seen in about one out of fifty files. These were mostly 1-B gains; one file got 7 B smaller and another 13 B. The larger the file, the larger the gains.
    - The error rate increased significantly:
      "More than 1024 Deflate blocks detected, this is not handled by this version." (known error with large PNGs)
      "Type 0 (uncompressed) block detected, this is not handled by this version." (known error)
      On a few files, huffmix terminated without any error message.
    - There is a huge increase in complexity. Previously, there was just one pipeline for all DEFLATE-based formats. I have to maintain a separate path for ZIP now, which is supported by ECT/defluff/DeflOpt but not by huffmix. And the result handling goes like this (see the sketch after this post):
      If huffmix exits with error code 0, all is well.
      Else if huffmix exits with error code 1, parse stdout:
        If stdout contains "Type 0 (uncompressed) block detected, this is not handled by this version.", we have a soft error. Just pick the smaller file.
        Else if stdout contains "More than 1024 Deflate blocks detected, this is not handled by this version.", we have a soft error. Just pick the smaller file.
        Else if stderr (not stdout!) contains "is an unknown file type", we have hit a file that huffmix doesn't understand. Just pick the smaller file. (This also happens with defluff and archives containing some Unicode characters.)
        Else we have a hard error like a destroyed file; abort.
    - There is much more file I/O going on, and there seems to be a very rare race condition with Win32's CopyFile and new processes. Not huffmix's fault, but something that is now being triggered much more often.
    All in all, I'm not sure I should keep working on it. It definitely is a great tool, but it comes with so many limitations and rough edges that the few bytes it gains me over my existing pipeline hardly justify the increased complexity.
    80 replies | 20549 view(s)
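    (A hedged C sketch of the result-handling rules described in the post above. The exit codes and message strings are only those reported there, and the function and enum names are hypothetical, not part of huffmix or Papa's Optimizer.)
        #include <string.h>

        enum verdict { VERDICT_OK, VERDICT_SOFT_PICK_SMALLER, VERDICT_HARD_ABORT };

        /* Classify one huffmix run, following the rules in the post. */
        static enum verdict classify_huffmix(int exit_code,
                                             const char *out, const char *err)
        {
            if (exit_code == 0)
                return VERDICT_OK;
            if (exit_code == 1) {
                if (strstr(out, "Type 0 (uncompressed) block detected"))
                    return VERDICT_SOFT_PICK_SMALLER;
                if (strstr(out, "More than 1024 Deflate blocks detected"))
                    return VERDICT_SOFT_PICK_SMALLER;
                if (strstr(err, "is an unknown file type"))   /* note: stderr */
                    return VERDICT_SOFT_PICK_SMALLER;
            }
            return VERDICT_HARD_ABORT;   /* e.g. a destroyed output file */
        }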
  • Jyrki Alakuijala's Avatar
    Yesterday, 09:53
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
    The same custom dictionary that zstd uses can be used for brotli. In my testing, half the time zstd's custom dictionary builder wins against brotli's similar tool, half the time the opposite. Surprisingly often it is an even better strategy (for the resulting compression density) to take 10 random samples of the data and concatenate them as a custom dictionary, rather than trying to be smart about it.
    255 replies | 82067 view(s)
  • Cyan's Avatar
    Yesterday, 05:22
    Cyan replied to a thread Brotli in Data Compression
    I think 5 samples is the absolute minimum. Sometimes, even that is not enough, when samples are pretty small. But 90K is relatively large, so that should do it (assuming you are adding multiple copies of the same file, adding empty files wouldn't work). Looking at your screenshot, I noticed a wildcard character `*`. I have no idea how shell expansion works on Windows. Chances are, it doesn't. Prefer using the `-r` command to load all files from a directory, this capability is internal to `zstd` so it should be okay even on Windows, since it doesn't depend on any shell capability.
    255 replies | 82067 view(s)
  • Cyan's Avatar
    Yesterday, 05:14
    I don't get the same result. When testing with LZ4_DISTANCE_MAX == 32767, I get 57548126 for enwik8, aka slightly worse than a distance of 64KB. In order to get the same compressed size as you, I first need to increase the memory usage by quite a bit (from 16 KB to 256 KB), which is the actual reason for the compression ratio improvement (and compression speed decrease). The impact of MAX_DISTANCE is less dramatic than for high compression mode because, by the very nature of the fast mode, it doesn't have much time to search, so most searches will end up testing candidates at rather short distances anyway. But still, reducing max distance should nonetheless, on average, correspond to some loss of ratio, even if a small one.
    4 replies | 189 view(s)
  • Kirr's Avatar
    Yesterday, 03:24
    Thanks for the kind words, SolidComp. I work with tons of biological data, which motivated me to first make a compressor for such data, and then this benchmark. I'll probably add FASTQ data in the future, if time allows.
    As for text, HTML, CSS and other data, I have no immediate plans. There are three main obstacles: 1. Computation capacity. 2. Selecting relevant data. 3. My time needed to work on it. Possibly it will require cooperating with other compression enthusiasts. I'll need to think about it.
    I'm under the impression that "zlib" is a compression library, and "gzip" is a command line interface to this same library. Since I benchmark command line compression tools, it's "gzip" that is included, rather than "zlib". However, please let me know if there is some alternative command line "zlib gzipper" that I am missing.
    Igor Pavlov's excellent LZMA algorithm (which powers 7-Zip) is represented by the "xz" compressor in the benchmark. Igor's unfortunate focus on Windows releases allowed "xz" to become the standard LZMA implementation on Linux (as far as I understand).
    You mean this one - https://github.com/ebiggers/libdeflate ? Looks interesting, I'll take a look at it. I noticed this bit in the GitHub readme: "libdeflate itself is a library, but the following command-line programs which use this library are also provided: gzip (or gunzip), a program which mostly behaves like the standard equivalent, except that it does not yet have good streaming support and therefore does not yet support very large files" - Not supporting very large files sounds alarming, especially without specifying what exactly they mean by "very large".
    Regarding gzip, don't get me started! Every single biological database shares data in gzipped form, wasting huge disk space and bandwidth. There is a metric ton of research on biological sequence compression, in addition to excellent general-purpose compressors. Yet the field remains stuck with gzip. I want to show that there are good alternatives to gzip, and that there are large benefits in switching. Whether this will have any effect remains to be seen. At least I migrated all my own data to a better format (saving space and increasing access speed).
    4 replies | 240 view(s)
  • smjohn1's Avatar
    Yesterday, 00:36
    README.md says of `LZ4_DISTANCE_MAX`: control the maximum offset that the compressor will allow. Set to 65535 by default, which is the maximum value supported by the lz4 format. Reducing the maximum distance will reduce opportunities for LZ4 to find matches, hence will produce a worse compression ratio.
    The above is true for high compression modes, i.e., levels above 3, but the opposite is true for compression levels 1 and 2. Here is a test result using the default value (65535):
    <TestData> lz4-v1.9.1 -b1 enwik8
    1#enwik8 : 100000000 -> 57262281 (1.746), 325.6 MB/s, 2461.0 MB/s
    and the result using a smaller value (32767):
    <TestData> lz4-1.9.1-32 -b1 enwik8
    1#enwik8 : 100000000 -> 53005796 (1.887), 239.3 MB/s, 2301.1 MB/s
    Anything unusual in the LZ4_compress_generic() implementation? Could anyone shed some light? Thanks in advance.
    4 replies | 189 view(s)
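    (Context for the discussion above: LZ4_DISTANCE_MAX is a compile-time constant, e.g. set by rebuilding with -DLZ4_DISTANCE_MAX=32767, not a runtime option, and the -b1 benchmark exercises LZ4's fast path. A minimal C sketch of that path, with illustrative buffer handling.)
        #include <stdlib.h>
        #include <lz4.h>

        /* Compress src with LZ4's fast (level-1 style) path.
           Returns the compressed size, or 0 on error. */
        static int lz4_fast_compress(const char *src, int src_size, char **dst_out)
        {
            int cap = LZ4_compressBound(src_size);   /* worst-case output size */
            char *dst = malloc(cap);
            if (!dst)
                return 0;

            /* Acceleration 1 == the default level-1 speed/ratio trade-off. */
            int n = LZ4_compress_fast(src, dst, src_size, cap, 1);
            if (n <= 0) {
                free(dst);
                return 0;
            }
            *dst_out = dst;
            return n;
        }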
  • Shelwien's Avatar
    25th May 2020, 23:14
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    I don't think so - there're bugfixes and various tweaks (mostly jpeg model), according to changelog. All changes should be included in v89. If you need something to test, why not test different ppmd parameters? https://github.com/kaitz/paq8pxd/blob/master/paq8pxd.cpp#L12013 These numbers there (12,6,210,64 etc) are somewhat random, so you can try increasing or decreasing them and check how it affects compression. (12 and 6 are PPM orders and 210,64 are memory allocation per ppmd instance).
    916 replies | 313474 view(s)
  • Darek's Avatar
    25th May 2020, 22:42
    Darek replied to a thread Paq8pxd dict in Data Compression
    Are there any changes worth to test v87 and v88?
    916 replies | 313474 view(s)
  • Shelwien's Avatar
    25th May 2020, 21:09
    > "shooting" hundreds of compressors a day
    That won't really work with gigabyte-sized datasets. At the slowest allowed speeds it would take more than an hour to compress. The number of attempts would be limited simply because of limited computing power (to 5 or so).
    10 replies | 658 view(s)
  • schnaader's Avatar
    25th May 2020, 20:10
    Another question that comes to my mind regarding a private dataset: will there be automation involved to get results quickly? Because with a private dataset I imagine literally "shooting" hundreds of compressors a day using different dictionaries to analyze the data. So would this be a valid and working strategy? Alex's quote "organizers will provide some samples" points in the direction of reducing this a bit so you can also work offline, but it would still be useful.
    10 replies | 658 view(s)
  • SolidComp's Avatar
    25th May 2020, 17:39
    Hi all – @Kirr made an incredibly powerful compression benchmark website called the Sequence Compression Benchmark. It lets you select a bunch of options and run it yourself, with outputs including graphs, column charts, and tables. It can run every single level of every compressor. The only limitation I see at this point is the lack of text datasets – it's mostly genetic data.
    @Kirr, four things:
    1. Broaden it to include text? Would that require a name change or ruin your vision for it? It would be great to see web-based text, like the HTML, CSS, and JS files of the 100 most popular websites for example.
    2. The gzipper you currently use is the GNU gzip utility program that comes with most Linux distributions. If you add some text datasets, especially web-derived ones, the zlib gzipper will make more sense than the GNU utility. That's the gzipper used by virtually all web servers.
    3. In my limited testing the 7-Zip gzipper is crazy good, so good that it approaches Zstd and brotli levels. It's long been known to be better than GNU gzip and zlib, but I didn't know it approached Zstd and brotli. It comes with the 7-Zip Windows utility released by Igor Pavlov. You might want to include it.
    4. libdeflate is worth a look. It's another gzipper.
    The overarching message here is that gzip ≠ gzip. There are many implementations, and the GNU gzip utility is likely among the worst.
    4 replies | 240 view(s)
  • SolidComp's Avatar
    25th May 2020, 17:20
    SolidComp replied to a thread Brotli in Data Compression
    Five times or five files? I added a fifth file, same error. Screenshot below:
    255 replies | 82067 view(s)
  • Shelwien's Avatar
    25th May 2020, 16:05
    Shelwien replied to a thread Brotli in Data Compression
    @SolidComp: Sorry, I left a mistake after renaming the samples subdirectory :) After running gen.bat, the dictionary is in the file named "dictionary". If you're on Linux, you can just repeat the operations in gen.bat manually: zstd --train produces the dictionary, and zstd -D compresses using it. There's also this option to control dictionary size:
    --maxdict=# : limit dictionary to specified size (default: 112640)
    255 replies | 82067 view(s)
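    (The same --train / -D flow is also exposed by the library; a rough C sketch using the ZDICT and ZSTD APIs, assuming zstd.h/zdict.h. Sample collection, error handling and the level are illustrative.)
        #include <zstd.h>
        #include <zdict.h>

        /* Train a dictionary from concatenated samples, then compress with it.
           samples: all training files concatenated; sizes[i]: length of sample i. */
        static size_t train_and_compress(void *dict, size_t dict_cap,      /* e.g. 112640 */
                                         const void *samples, const size_t *sizes, unsigned nb,
                                         void *dst, size_t dst_cap,
                                         const void *src, size_t src_size)
        {
            size_t dict_size = ZDICT_trainFromBuffer(dict, dict_cap, samples, sizes, nb);
            if (ZDICT_isError(dict_size))
                return dict_size;

            ZSTD_CCtx *cctx = ZSTD_createCCtx();
            size_t csize = ZSTD_compress_usingDict(cctx, dst, dst_cap, src, src_size,
                                                   dict, dict_size, 19 /* level */);
            ZSTD_freeCCtx(cctx);
            return csize;   /* check with ZSTD_isError() */
        }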
  • Cyan's Avatar
    25th May 2020, 15:25
    Cyan replied to a thread Zstandard in Data Compression
    --patch-from is a new capability designed to reduce the size of transmitted data when updating a file from one version to another. In this model, it is assumed that:
    - the old version is present at the destination site
    - new and old versions are relatively similar, with only a handful of changes
    If that's the case, the compression ratio will be ridiculously good. zstd will see the old version as a "dictionary" when generating the patch and when decompressing the new version. So it's not a new format: the patch is a regular zstd compressed file.
    429 replies | 130387 view(s)
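    (At the library level this "old version as dictionary" idea maps onto libzstd's prefix/dictionary mechanism. A hedged C sketch, assuming the stable advanced API; the window/long-mode tuning the CLI applies for --patch-from is left out.)
        #include <zstd.h>

        /* Create a "patch": compress new_v while referencing old_v as a prefix. */
        static size_t make_patch(void *dst, size_t dst_cap,
                                 const void *new_v, size_t new_size,
                                 const void *old_v, size_t old_size)
        {
            ZSTD_CCtx *cctx = ZSTD_createCCtx();
            ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 19);
            ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
            ZSTD_CCtx_refPrefix(cctx, old_v, old_size);   /* old version = reference */
            size_t n = ZSTD_compress2(cctx, dst, dst_cap, new_v, new_size);
            ZSTD_freeCCtx(cctx);
            return n;
        }

        /* Apply the patch: the decoder must reference the same old version. */
        static size_t apply_patch(void *dst, size_t dst_cap,
                                  const void *patch, size_t patch_size,
                                  const void *old_v, size_t old_size)
        {
            ZSTD_DCtx *dctx = ZSTD_createDCtx();
            ZSTD_DCtx_refPrefix(dctx, old_v, old_size);
            size_t n = ZSTD_decompressDCtx(dctx, dst, dst_cap, patch, patch_size);
            ZSTD_freeDCtx(dctx);
            return n;
        }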
  • Cyan's Avatar
    25th May 2020, 15:15
    Cyan replied to a thread Brotli in Data Compression
    You could try it 5 times. Assuming that the source file is ~90K, this should force the trainer to provide a dictionary from this material. Note though that the produced dictionary will be highly tuned for this specific file, which is not the target model. In production environment, we tend to use ~10K samples, randomly extracted from an even larger pool, in order to generate a dictionary for a category of documents.
    255 replies | 82067 view(s)
  • Jyrki Alakuijala's Avatar
    25th May 2020, 10:17
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
    Based on a quick look at the makefiles, we are not using the fast-math option. However, there can be more uncertainty, like perhaps using multiply-and-add as a single instruction leading to a different result than doing multiply and add as two separate instructions. (I'm a bit out of touch with this field. Compilers, vectorization and new instructions are improved constantly.)
    255 replies | 82067 view(s)
  • Kirr's Avatar
    25th May 2020, 05:58
    Kirr replied to a thread Zstandard in Data Compression
    From source, when possible. Thanks, will clarify it on website (in the next update).
    429 replies | 130387 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:16
    SolidComp replied to a thread Zstandard in Data Compression
    Do you build the compressors from source, or do you use the builds provided by the projects?
    429 replies | 130387 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:11
    SolidComp replied to a thread Brotli in Data Compression
    Do you specify fast math in the makefile or cmake?
    255 replies | 82067 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:11
    SolidComp replied to a thread Brotli in Data Compression
    Where's the dictionary?
    255 replies | 82067 view(s)
  • Kirr's Avatar
    25th May 2020, 02:59
    Kirr replied to a thread Zstandard in Data Compression
    zstd is now updated to 1.4.5 in my benchmark: http://kirr.dyndns.org/sequence-compression-benchmark/ I noticed good improvement in decompression speed for all levels, and some improvement in compression speed for slower levels. (Though I am updating from 1.4.0, so the improvement may be larger than from 1.4.4).
    429 replies | 130387 view(s)
  • redrabbit's Avatar
    25th May 2020, 02:09
    Thanks for the explanation and the testing
    84 replies | 13178 view(s)
  • terrelln's Avatar
    25th May 2020, 01:55
    terrelln replied to a thread Zstandard in Data Compression
    Both single-thread and multi-thread modes are deterministic, but they produce different results. Multi-threaded compression produces the same output with any number of threads. The zstd cli defaults to multi-threaded compression with 1 worker thread. You can opt into single-thread compression with --single-thread.
    429 replies | 130387 view(s)
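    (In library terms the worker count terrelln describes is just a compression parameter; a small C sketch, assuming a multithread-enabled libzstd build, with illustrative buffers.)
        #include <zstd.h>

        /* nb_workers == 0 -> single-thread mode (what --single-thread selects);
           nb_workers >= 1 -> multi-threaded framing: output is independent of the
                              actual worker count, but differs from the 0-worker mode. */
        static size_t compress_with_workers(void *dst, size_t dst_cap,
                                            const void *src, size_t src_size,
                                            int level, int nb_workers)
        {
            ZSTD_CCtx *cctx = ZSTD_createCCtx();
            ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, level);
            ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nb_workers);
            size_t n = ZSTD_compress2(cctx, dst, dst_cap, src, src_size);
            ZSTD_freeCCtx(cctx);
            return n;   /* check with ZSTD_isError() */
        }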
  • schnaader's Avatar
    24th May 2020, 23:47
    OK, so here's the long answer. I could reproduce the bad performance on the Life is Strange 2 testfile; my results are in the tables below. There are two things this all boils down to: preflate (vs. zlib brute force in xtool and Precomp 0.4.6) and multithreading.
    Note that both the Life is Strange 2 times and the decompressed size are very similar for Precomp 0.4.6 and xtool when considering the multithreading factor (computed by using the time command and dividing "user" time by "real" time). Also note that the testfile has many small streams (64 KB decompressed each); preflate doesn't seem to use its multithreading in that case.
    Although preflate can be slower than zlib brute force, it also has big advantages, which can be seen when looking at the Eternal Castle testfile. It consists of big PNG files; preflate can make use of multithreading (though not fully utilizing all cores) and is faster than the zlib brute force. And the zlib brute force doesn't even manage to recompress any of the PNG files. Xtool's (using reflate) decompressed size is somewhere between those two, most likely because reflate doesn't parse multi PNG files and can only decompress parts of them because of this.
    So, enough explanation, how can the problem be solved? Multithreading, of course. The current branch already features multithreading for JPEG when using -r, and I'm working on it for deflate streams. When it's done, I'll post fresh results for the Life is Strange 2 testfile; they should be very close to xtool if things work out well. Multithreaded -cn or -cl, though, is a bit more complex. I've got some ideas, but I have to test them and it will take longer.
    Test system: Hetzner vServer CPX21: AMD Epyc, 3 cores @ 2.5 GHz, Ubuntu 20.04 64-bit
    Eternal Castle testfile, 223,699,564 bytes
    program                                   decompressed size   time (decompression/recompression)   multithreading factor (decompression/recompression)   compressed size (-nl)
    Precomp 0.4.8dev -cn -d0                  5,179,454,907       5 min 31 s / 4 min 45 s              1.73 / 1.64                                           118,917,128
    Precomp 0.4.6 -cn -d0                     223,699,589         8 min 31 s                           1.00                                                  173,364,804
    xtool (redrabbit's result)                2,099,419,005
    Life is Strange 2 testfile, 632,785,771 bytes
    program                                   decompressed size   time (decompression/recompression)   multithreading factor (decompression/recompression)
    Precomp 0.4.8dev -cn -intense0 -d0        1,499,226,364       3 min 21 s / 2 min 14 s              0.91 / 0.99
    Precomp 0.4.8dev (after tempfile fix)     1,499,226,364       3 min 11 s / 2 min 21 s              0.92 / 0.99
    Precomp 0.4.6 -cn -intense0 -d0           1,497,904,244       1 min 55 s / 1 min 43 s              0.93 / 0.98
    xtool 0.9 e:precomp:32mb,t4:zlib (Wine)   1,497,672,882       46 s / 36 s                          2.75 / 2.87
    84 replies | 13178 view(s)
  • Jyrki Alakuijala's Avatar
    24th May 2020, 23:36
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
    I don't remember what was improved. Perhaps hashing. There is a lot of floating point in brotli 10 and 11. I don't know how compiler invariant it is the way we do it. I'd assume it to be well-defined, but this could be a naive position.
    255 replies | 82067 view(s)
  • Shelwien's Avatar
    24th May 2020, 21:20
    Shelwien replied to a thread Brotli in Data Compression
    Normally it just needs more data for training. But here I made a workaround for you: Update: attachment deleted since there was a mistake in the script and it didn't work.
    255 replies | 82067 view(s)
  • schnaader's Avatar
    24th May 2020, 20:44
    @redrabbit: Just a quick post to let you know I'm currently researching the bad performance you describe on those 2 files, plus some things I found while profiling the latest Precomp version. Don't waste too much time on this.
    When I saw your post, I wondered where zlib is still used, as preflate does all the deflate-related work itself. But zlib is indeed still used to speed up intense and brute mode (testing the first few bytes of a potential stream to avoid recompressing false positives). Profiling the latest version shows that for the Life Is Strange 2 file you posted, this only uses 0.3% of CPU time (of .pak -> .pcf; in -r, zlib isn't used at all). So using a faster zlib library could only speed things up by 0.3%.
    On the other hand, I found something else and fixed it some minutes ago in both branches: about 5% of CPU time was wasted because uncompressed data was written to a file to prepare recursion even though both "-intense0" and "-d0" disable recursion, so the temporary file wasn't used at all. Fixed this by writing the file only if it's used. Testing shows it works: 3 min 11 s instead of 3 min 21 s for "-cn -intense0 -d0" on the Life Is Strange 2 file. Not much, but some progress. It might have more impact on non-SSD drives.
    84 replies | 13178 view(s)
  • SolidComp's Avatar
    24th May 2020, 20:36
    SolidComp replied to a thread Brotli in Data Compression
    I tried to create a dictionary with --train, but I get an error saying not enough samples or something. I tried it with just the jQuery file as the training sample, which used to work in the past. Then I tried two, then three, then four jQuery files (the previous versions, 3.5.0, 3.4.1, etc.), and still get the error even with four files. Not sure what I'm doing wrong.
    255 replies | 82067 view(s)
  • Hakan Abbas's Avatar
    24th May 2020, 19:59
    In data compression, speed is as important as compression ratio. Looking at the industry, it is clear that faster products are preferred even if they compress less. It is not worth spending much more energy than necessary for a small saving in data, no matter who is behind the product. In most cases the cost (CPU, RAM, ...) of the computations is taken into consideration, and whenever possible, situations requiring excessive processing load are avoided. However, as we have seen, it cannot be said that attention was paid to these points for AVIF/AV1.
    15 replies | 687 view(s)
  • Shelwien's Avatar
    24th May 2020, 19:58
    Shelwien replied to a thread Brotli in Data Compression
    It's an unfair comparison, since both brotli and zstd support an external dictionary, but brotli silently uses its integrated one:
    89,476 jquery-3.5.1.min.js
    27,959 jquery-3.5.1.min.bro // brotli_gc82.exe -q 11 -fo jquery-3.5.1.min.bro jquery-3.5.1.min.js
    28,754 jquery-3.5.1.min.bro // brotli with dictionary zeroed out in .exe
    29,453 jquery-3.5.1.min.zst // zstd.exe --ultra -22 -fo jquery-3.5.1.min.zst jquery-3.5.1.min.js
    28,942 jquery-3.5.1.min.zst // zstd.exe --ultra -22 -fo jquery-3.5.1.min.zst -D dictionary.bin jquery-3.5.1.min.js
    This example uses brotli's default dictionary for zstd, but we can generate a custom dictionary for zstd, while that is harder to do for brotli. Yes, brotli at max settings has a stronger entropy model than zstd. But it also has 5x slower encoding and 2x slower decoding. And the actual compression difference is still uncertain, since we can build a larger specialized dictionary for the target data with zstd --train.
    255 replies | 82067 view(s)
  • Jyrki Alakuijala's Avatar
    24th May 2020, 19:40
    My basic understanding is that avif decoders are roughly half the speed of jpeg xl decoders. Getting the fastest avif decoding may require turning off some features that are always on in jpeg xl, for example no yuv444, no more than 8 bit of dynamics. It may be that hardware decoders for avif may not be able to do streaming, i.e., to display that part of the image that is already decoded. For some applications this can be a blocker.
    15 replies | 687 view(s)
  • Jyrki Alakuijala's Avatar
    24th May 2020, 19:30
    Pik is a superfast, simple format for photography-level qualities (1.5+ bpp), and gives great quality/density there. It used dct8x8. The requirements for JPEG XL included lower rates; down to 0.06 bpp was discussed. Variable DCT sizes and filtering improve the performance at lower bpps, and keep it state-of-the-art down to 0.4 bpp and somewhat decent at 0.15 bpp. This added a lot of code and a 2x decoding slowdown to cover a larger bpp range.
    Further, we integrated FUIF into PIK. We are still in the process of figuring out all the possibilities it brings. FUIF seems much less psychovisually efficient, but is a more versatile coding. Luca and Jon developed the FUIF code further after integrating it into JPEG XL. Jon has in general been great to collaborate with, and I am quite proud of having made the initial proposals of basing JPEG XL on these two codecs. Everyone in the team has grown very comfortable with the fusion of these two codecs.
    15 replies | 687 view(s)
  • Jyrki Alakuijala's Avatar
    24th May 2020, 19:16
    I'm not an expert on JPEG XT. XT is likely an HDR patch on top of usual JPEGs, not particularly effective at compression density.
    15 replies | 687 view(s)
  • LucaBiondi's Avatar
    24th May 2020, 19:07
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
    Thank you!!
    916 replies | 313474 view(s)
  • DZgas's Avatar
    24th May 2020, 18:36
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    I think the large corporations made the AV1 codec for themselves, not for users, haha. But decoding with the standard dav1d is fine... despite the fact that my laptop can't handle VP9 1080p or AV1 720p, and only barely manages HEVC 1080p. Problems of progress - I am slow. Obviously AV1 is the slowest codec and the strongest of all; encoding speed is what is currently killing it. Despite being ~25% better than HEVC or VP9, it is 10-20 times slower, which is serious for people who do not have a powerful PC/server. Well, almost all of the Internet is still using the old AVC, because it's fast and you can decode it even on a watch.
    15 replies | 687 view(s)
  • SolidComp's Avatar
    24th May 2020, 17:25
    SolidComp replied to a thread Zstandard in Data Compression
    Sportman, why is the compressed size different for single thread vs multithreaded? Is it supposed to produce different results? I thought it would be deterministic at any given compression level.
    429 replies | 130387 view(s)
  • SolidComp's Avatar
    24th May 2020, 17:19
    Yes, AVIF was not intended for cameras. The encoders are still incredibly slow, as are the encoders for AV1 video (though I think Intel's SVT AV1 encoder is improving). Do you know if the decoders are reasonably fast?
    15 replies | 687 view(s)
  • SolidComp's Avatar
    24th May 2020, 17:18
    Jyrki, is either JPEG XL or AVIF related to PIK? What became of PIK? And I'm confused by JPEG XT – do you know if it's related to XL?
    15 replies | 687 view(s)
  • SolidComp's Avatar
    24th May 2020, 17:02
    SolidComp replied to a thread Brotli in Data Compression
    Hi all – I'm impressed with the results of compressing jQuery with brotli, compared to Zstd and libdeflate (gzip):
    Original jQuery 3.5.1 (latest): 89,476 bytes (this is the minified production version from jQuery.com: Link)
    libdeflate 1.6 gzip -11: 36,043 (libdeflate adds two extra compression levels to zlib gzip's nine)
    Zstd 1.4.5 -22: 29,453
    brotli 1.0.4 -11: 28,007
    brotli 1.0.7 -11: 27,954
    Update: 7-Zip's gzipper is incredible: 29,960 bytes. I'm not sure why it's so much better than libdeflate, or how it's so close to Zstd and brotli.
    Compression of web files is much more important to me than the weird benchmarks that are typically used. And this is where brotli shines, not surprisingly, since it was designed for the web. Note that brotli has a dictionary generated from web files, whereas Zstd and libdeflate do not. You can generate a dictionary with Zstd, but it keeps giving me an error saying there aren't enough samples...
    Brotli 1.0.7 performs slightly better than 1.0.4, which was surprising since there was nothing in the release notes indicating improvements to the max compression setting (-11), just an improvement to the -1 setting. The only other difference is that I compiled my 1.0.7 version myself in Visual Studio 2019, dynamically linked, whereas my 1.0.4 version is the official build from GitHub, a static executable (1.0.4 is the last version they released builds for – all they've released since is source code for 1.0.5 through 1.0.7).
    Jyrki, should compression results (size) be exactly the same across compilers, i.e. deterministic given the same settings? So it has to be something in the source code of 1.0.7 compared to 1.0.4 that explains the improvement, right, not Visual Studio?
    255 replies | 82067 view(s)
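    (For completeness, the one-shot C entry point that `brotli -q 11` exercises looks roughly like this, assuming brotli's public encode.h; buffer sizing and the TEXT mode hint are illustrative, and the built-in web dictionary is applied automatically.)
        #include <brotli/encode.h>

        /* Compress src at maximum quality (what `brotli -q 11` does). */
        static int brotli_q11(const uint8_t *src, size_t src_size,
                              uint8_t *dst, size_t *dst_size /* in: capacity, out: used */)
        {
            BROTLI_BOOL ok = BrotliEncoderCompress(
                BROTLI_MAX_QUALITY,        /* quality 11 */
                BROTLI_DEFAULT_WINDOW,     /* lgwin 22 */
                BROTLI_MODE_TEXT,          /* hint: minified JS is text */
                src_size, src, dst_size, dst);
            return ok == BROTLI_TRUE ? 0 : -1;
        }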
  • moisesmcardona's Avatar
    24th May 2020, 16:30
    The commits are there; Kaitz just didn't post the compiled versions:
    v87: https://github.com/kaitz/paq8pxd/commit/86969e4174f8f3f801f9a0d94d36a8cbda783961
    v88: https://github.com/kaitz/paq8pxd/commit/7969cc107116c31cd997f37359b433994fea1f6d
    I've attached them with the source from their respective commits, compiled with march=native on my AMD Ryzen 9 CPU.
    916 replies | 313474 view(s)
  • Jyrki Alakuijala's Avatar
    24th May 2020, 00:19
    Jpeg XL is not frozen yet. (Other than the jpeg repacking part of it.) Our freezing schedule is end of August 2020. Before that it is not a good idea to integrate for other than testing use.
    15 replies | 687 view(s)