Activity Stream

  • Jyrki Alakuijala's Avatar
    Today, 10:27
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
This is likely a misunderstanding. Brotli can use the same linear dictionaries used in zstd, with the same tooling. The dictionary mechanism with a simple grammar is in addition to that, but ordinary linear dictionaries can be used. One just gets a bit less benefit from them (though not less than zstd gets from these simple dictionaries). Zstd does not yet support transforms on dictionaries as far as I know.
    256 replies | 82160 view(s)
  • Jyrki Alakuijala's Avatar
    Today, 10:13
Could you file an issue either on the JPEG XL GitLab or on the brunsli GitHub repo and we will look at it, and make them not differ. We have run this successfully on lots of files, so this is either a special corner case or, more likely, a recent bug. We are currently converting these compressors to a more streaming mode of operation and to more easily streamable APIs, and this bug might have come from that effort. Thank you in advance!
    154 replies | 36158 view(s)
  • Shelwien's Avatar
    Today, 06:38
Adding decompressor size requires absurd data sizes to avoid exploits (for a 1GB dataset, the compressed zstd size is still ~0.1% of the total result). Otherwise the contest can turn into a decoder-size optimization contest, if the intermediate 1st place is open-source. Also, Alex pushes for a mixed dataset (part public, part private, with uncertain shares), but I think that it just combines the negatives of both options (overtuning still possible on the public part, decoder size still necessary to avoid exploits, compressed size of the secret part still not 100% predictable in advance).
    12 replies | 723 view(s)
  • SvenBent's Avatar
    Today, 04:49
I can't vote, but I would vote private/secret. A public dataset encourages overtuning, which is not really helpful or a demonstration of general compression ability; in the real world the compressor does not know the data ahead of compression time. I would still add size + decompressor, though.
    12 replies | 723 view(s)
  • Cyan's Avatar
    Today, 04:09
    Cyan replied to a thread Zstandard in Data Compression
So far, we have only thoroughly compared with bsdiff. We can certainly extend the comparison to more products, to get a more complete picture. MT support for --patch-from works just fine.

In terms of positioning, zstd is trying to bring speed to the formula: fast generation of patches, fast application of patches. There are use cases which need speed and will like this trade-off, compared to more established solutions which tend to be less flexible in terms of speed range. At this stage, we don't try to claim the "best" patch size. There are a few scenarios where zstd can be quite competitive, but that's not always the case. This first release will hopefully help us understand users' expectations, in order to select the next batch of improvements.

This is new territory for us, and there is still plenty of room for improvement, both feature- and performance-wise. One aspect unclear to me is how much benefit a dedicated diff engine could achieve (as opposed to recycling our "normal" search engine) while preserving the zstd format. There are, most likely, some limitations introduced by the format, since it wasn't created with this purpose in mind. But how much comes from the format, as opposed to the engine? This part is unclear to me. Currently, I suspect that the most important limitations come from the engine, hence better patch sizes should be possible.
    432 replies | 130538 view(s)
  • Shelwien's Avatar
    Today, 02:42
    Shelwien replied to a thread Zstandard in Data Compression
I asked FitGirl to test it... got this:

1056507088 d2_game2_003.00              // (1) game data
1383948734 d2_game2_003.resources       // (2) precomp output
 327523769 d2_game2_003.resources.x5    // xdelta -5
 245798553 d2_game2_003.resources.x5.8  // compressed
 278021923 d2_game2_003.resources.zsp   // zstd -patch
 247363158 d2_game2_003.resources.zsp.8 // compressed

Speed-wise zstd patching seems good, but it has a 2G window limit, MT support for this is unknown, and overall specialized patchers seem to work better.
    432 replies | 130538 view(s)
  • Bulat Ziganshin's Avatar
    Today, 01:53
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
Cyan, did you compare it against xdelta and similar algos?
    432 replies | 130538 view(s)
  • skal's Avatar
    Yesterday, 15:36
The trick is to put the index ("jump table") in a COM segment, which is reserved for comments. Mildly non-standard JPEGs, but workable.
    14 replies | 585 view(s)
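Editor's note: a minimal sketch (not from this thread) of building such an index segment. The COM marker (0xFF 0xFE) and its 2-byte big-endian length field, which counts itself plus the payload (payload ≤ 65533 bytes), are standard JPEG; the payload layout here (a "JIDX" tag followed by 32-bit big-endian offsets) is purely hypothetical.

    /* Sketch: serialize a "jump table" of scan byte offsets into a JPEG COM segment. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Writes the segment into out[] and returns its total size, or 0 if it doesn't fit. */
    static size_t write_com_index(uint8_t* out, size_t cap,
                                  const uint32_t* offsets, size_t n) {
        size_t payload = 4 + 4 * n;        /* "JIDX" tag + one 32-bit offset per entry */
        size_t total = 2 + 2 + payload;    /* marker + length field + payload */
        if (payload + 2 > 65535 || total > cap) return 0;

        out[0] = 0xFF; out[1] = 0xFE;                  /* COM marker */
        out[2] = (uint8_t)((payload + 2) >> 8);        /* segment length, big-endian, */
        out[3] = (uint8_t)((payload + 2) & 0xFF);      /* counts itself but not the marker */
        memcpy(out + 4, "JIDX", 4);                    /* hypothetical tag */
        for (size_t i = 0; i < n; i++) {
            out[8 + 4*i + 0] = (uint8_t)(offsets[i] >> 24);
            out[8 + 4*i + 1] = (uint8_t)(offsets[i] >> 16);
            out[8 + 4*i + 2] = (uint8_t)(offsets[i] >> 8);
            out[8 + 4*i + 3] = (uint8_t)(offsets[i]);
        }
        return total;
    }

    int main(void) {
        uint32_t offsets[] = { 0x0210, 0x8F40, 0x11C30 };  /* made-up bitstream offsets */
        uint8_t seg[64];
        size_t n = write_com_index(seg, sizeof(seg), offsets, 3);
        printf("COM segment of %zu bytes (to be spliced in among the header segments)\n", n);
        return 0;
    }

A decoder that knows about the tag can seek straight to those offsets; any other decoder simply skips the segment as a comment, which is why such files stay mildly non-standard but readable.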
  • Kirr's Avatar
    Yesterday, 13:49
Thanks James, looks like I'll have to add libdeflate soon. I'm still worried about its gzip replacement not supporting large files. I guess I'll see what they mean.
    5 replies | 351 view(s)
  • pklat's Avatar
    Yesterday, 13:12
I've tried the lossless JPEG repack again with an updated jpegxl and some other image. It looks normal now, it didn't skew it, but it's not rotated, so presumably something is lost (metadata?). Is there some option for the (d)encoder to unpack to the original besides '-j'? I've tried zero-padding the start of the unpacked file and comparing it to the original, but couldn't find any match. I also converted both to .bmp, but they differ.
    154 replies | 36158 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 12:45
There are two separate optimization goals: S1, what is best for users, and S2, what is easiest for the webmaster (or other deployer of compression) to run business as usual.

For S1 we need to look at the whole information system and its efficiency and cost factors. Often we can find out, using simple economic modeling, that user time and attention are worth about 1000x the cost of the computer. This is understandable in the light of the computer and the energy it consumes being worth around $1000, while a human has a value that trumps $1000000. Users can be blocked by two different things related to compression: data transfer payload size and excessive coding resource use (mostly decoding speed, and rarely encoding, too). Current compression methods in general are not spending enough CPU and memory to fully optimize for S1. That is because people also optimize for S2.

For S2, people often consider compression separately, outside of the larger information processing system. Then they run a profiler and see % values. Many engineers are willing to save 20 % of CPU speed while losing 5 % of density. The %-to-% comparison seems superficially like an oranges-to-oranges comparison. They may however lack the knowledge that only the density modulates the user-experienced speed, and that that cost is 1000x more than the CPU cost. They are able to transfer cost from their company to their clients. Also, these engineers may not even have access to data transfer costs, so that they could at least do a pure cost-based optimization with disregard to users. However, ignoring the users will cost the company revenue and growth. Saving a few thousand in compression CPU use can lead to a double-digit revenue drop for a big e-commerce company.

I often see that junior engineers are very keen on profile-based optimization and try to switch to faster compression algorithms that are only a few percent worse, and the more senior engineers with a holistic system point of view stop them, or ask them to run an experiment to show that there is no negative effect on conversions/revenue/returning users etc. From naive positioning based on S2, we get many webservers configured with no compression or with the ineffective gzip quality 1 compression.
    16 replies | 803 view(s)
  • Jon Sneyers's Avatar
    Yesterday, 11:59
I wouldn't say sequential codecs are more efficient than parallel ones: you can always just use a single thread, and avoid the (of course unavoidably imperfect) parallel scaling. If you have enough images to process at the same time (like Cloudinary, or maybe rendering a website with lots of similar-sized images), you can indeed best just parallelize that way and use a single thread per image. There are still cases where you don't have enough images to keep your cores busy with parallel single-thread processes, though. For end-users, I think the "Photoshop case" is probably a rather common case.

Restart markers in JPEG only allow you to do parallel encode, not parallel decode. A decoder doesn't know if and where the next restart marker occurs, and what part of the image data it represents. You can also only do stripes with restart markers, not tiles. So even if you'd add some custom metadata to make an index of restart marker bitstream/image offsets, it would only help to do full-image parallel decode, not cropped decode (e.g. decoding just a 1000x1000 region from a gigapixel image). I don't think the fact that no one is trying to do this is telling. Applications that need efficient parallel/cropped decode (e.g. medical imaging) just don't use JPEG, but e.g. JPEG 2000.

Multiplying the JPEG numbers by 4 doesn't make much sense, because you can't actually decode a JPEG 4x faster on 4 cores than on 1 core. Dividing the JPEG XL numbers by 3 (for decode) and by 2 (for encode) is what you need to do to get "fair" numbers: that's the speed you would get on a single core (the factor is not 4 because parallel scalability is never perfect).

There's a reason why all the HEIC files produced by Apple devices use completely independently encoded 256x256 tiles. Otherwise encode and decode would probably be too slow. The internal grid boundary artifacts are a problem in this approach though.
    14 replies | 585 view(s)
  • Shelwien's Avatar
    Yesterday, 02:59
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    Works like this for me. You only need to compile one cpp file - pxd.cpp itself. It already includes everything else, except for zlib.
    921 replies | 314115 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 01:10
Inside the paq8sk19 archive there is g.bat to compile it; just rename the cpp file. https://encode.su/threads/3371-Paq8sk/page3
    921 replies | 314115 view(s)
  • LucaBiondi's Avatar
    Yesterday, 01:00
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
Hi all, I am trying to compile the latest sources of PAQ8PXD. I am using the script below, which I used to compile PAQ8PX, but I get errors. I am not good with C++. Could someone help me? Could someone give me a script? Thank you, Luca. This is the script: inside zlist I put: The errors are... Thank you as usual! Luca
    921 replies | 314115 view(s)
  • LucaBiondi's Avatar
    27th May 2020, 23:50
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
Thank you! It could probably be useful to store the parameters in a text file (so... no file and you get the standard behaviour), or, better yet, to have the possibility to set them from the command line. That way we can avoid recompiling the sources every time. Luca
    921 replies | 314115 view(s)
  • Shelwien's Avatar
    27th May 2020, 21:24
    Shelwien replied to a thread Paq8pxd dict in Data Compression
> Do you know if there is a memory limitation allocating more than 1680 MB or 2000 MB?
Afaik it should support up to 4095MB on x64. It uses 32-bit pointers for the tree, so no more than 4GB for certain, but up to that it should work. There's also a third parameter which controls what ppmd does when its memory runs out: 0 = full reset, 1 = tree reduction, which keeps 75% of the stats. In fact this is also a tunable parameter: https://github.com/kaitz/paq8pxd/blob/master/mod_ppmd.inc#L727 (3*(M>>2)) = 3*M/4 = 75%. You can edit it to "GetUsedMemory()>96*(SubAllocatorSize>>7)", then try adjusting the coef (1..127 instead of 96 here).
    921 replies | 314115 view(s)
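Editor's note: a tiny standalone illustration of the arithmetic in Shelwien's suggestion above. Both expressions encode a 75% threshold; the >>7 form just makes the fraction tunable in 1/128 steps (GetUsedMemory and SubAllocatorSize are the names used in mod_ppmd.inc and appear here only in comments).

    /* Illustration: 3*(M>>2) and 96*(M>>7) are the same 75% threshold;
     * varying the coefficient in the >>7 form moves it in 1/128 steps. */
    #include <stdio.h>

    int main(void) {
        unsigned long long M = 1680ULL << 20;   /* e.g. a 1680 MB sub-allocator */
        printf("3*(M>>2)   = %llu  (75.0%%)\n", 3 * (M >> 2));
        for (int coef = 80; coef <= 112; coef += 8)
            printf("%3d*(M>>7) = %llu  (%.1f%%)\n",
                   coef, (unsigned long long)coef * (M >> 7), 100.0 * coef / 128);
        return 0;
    }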
  • JamesB's Avatar
    27th May 2020, 17:51
I think libdeflate is the fastest tool out there right now, unless limiting to light-weight "level 1" style, in which case maybe libslz wins out. We integrated libdeflate support into Samtools, for (de)compression of sequencing alignment data in the BAM format. I suspect this is the cause of libdeflate becoming an official Ubuntu package, as Samtools/htslib have it as a dependency.

I recently retested several deflate implementations on enwik8.

Tool         Encode      Decode     Size
------------------------------------------
vanilla      0m5.003s    0m0.517s   36548933
intel        0m3.057s    0m0.503s   36951028
cloudflare   0m2.492s    0m0.443s   36511793
jtkukunas    0m2.956s    0m0.357s   36950998
ng           0m2.022s    0m0.377s   36881293
zstd (gz-6)  0m4.674s    0m0.468s   36548933
libdeflate   0m1.769s    0m0.229s   36648336

Note the file sizes fluctuate a bit. That's within the difference between gzip -5 vs -6, so arguably you'd include that in the time difference too.

I also tried them at level 1 compression:

Tool         Encode      Decode     Size
------------------------------------------
vanilla      0m1.851s    0m0.546s   42298786
intel        0m0.866s    0m0.524s   56046821
cloudflare   0m1.163s    0m0.470s   40867185
jtkukunas    0m1.329s    0m0.392s   40867185
ng           0m0.913s    0m0.397s   56045984
zstd (gz)    0m1.764s    0m0.475s   42298786
libdeflate   0m1.024s    0m0.235s   39597396

Level 1 is curious, as you can see very much how different versions have traded off encoder speed vs size efficiency, with cloudflare and jtkukunas apparently using the same algorithm, and intel/ng likewise. Libdeflate is no longer the fastest here, but it's not far off and is the smallest, so it's in a sweet spot.

And for fun, level 9:

Tool         Encode      Decode     Size
------------------------------------------
vanilla      0m6.113s    0m0.516s   36475804
intel        0m5.153s    0m0.516s   36475794
cloudflare   0m2.787s    0m0.442s   36470203
jtkukunas    0m5.034s    0m0.365s   36475794
ng           0m2.872s    0m0.371s   36470203
zstd (gz)    0m5.702s    0m0.467s   36475804
libdeflate   0m9.124s    0m0.237s   35197159

All remarkably similar sizes, bar libdeflate, which took longer but squashed it considerably more. Libdeflate actually goes up to -12, but it's not a good trade-off on this file:

libdeflate   0m14.660s   0m0.236s   35100586

Edit: I tested 7z max too, but it was comparable to libdeflate max and much slower.
    5 replies | 351 view(s)
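Editor's note: for context, a minimal sketch (based on libdeflate's public libdeflate.h API, not on the Samtools/htslib integration above) of one-shot gzip compression of an in-memory buffer. This whole-buffer design is where much of libdeflate's speed advantage over streaming zlib comes from, and also behind the "not supporting large files" concern Kirr mentions above. Link with -ldeflate.

    /* Sketch: one-shot gzip compression with libdeflate's in-memory API. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <libdeflate.h>

    int main(void) {
        const char* text = "the quick brown fox jumps over the lazy dog\n";
        size_t inSize = strlen(text);

        struct libdeflate_compressor* c = libdeflate_alloc_compressor(6);  /* levels 1..12 */
        if (!c) return 1;

        size_t bound = libdeflate_gzip_compress_bound(c, inSize);
        void* out = malloc(bound);
        size_t outSize = libdeflate_gzip_compress(c, text, inSize, out, bound);
        if (outSize == 0) {                 /* 0 means the output buffer was too small */
            fprintf(stderr, "compression failed\n");
            return 1;
        }
        printf("%zu -> %zu bytes (gzip)\n", inSize, outSize);

        libdeflate_free_compressor(c);
        free(out);
        return 0;
    }

The decompression side (libdeflate_gzip_decompress) likewise works on whole buffers rather than streams.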
  • LucaBiondi's Avatar
    27th May 2020, 17:10
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
Hi Shelwien, about the testing parameters:

for ppmd_12_256_1:
level 8:  order 12, memory 210 MB
level 9:  order 16, memory 420 MB
level 10: order 16, memory 840 MB
level 11: order 16, memory 1680 MB
level 12: order 16, memory 1680 MB
level 13: order 16, memory 1680 MB
level 14: order 16, memory 1680 MB
level 15: order 16, memory 1680 MB

for ppmd_6_64_2:
order 6, memory 64 MB (for each level)

Do you know if there is a memory limitation allocating more than 1680 MB or 2000 MB? Thank you, Luca
    921 replies | 314115 view(s)
  • cssignet's Avatar
    27th May 2020, 15:24
Oops, sorry then (I removed the wrong statement); I have to admit that I did not check the results here :). I just did a few trials and met some situations where a PNG would be smaller stored as lossless with JXL (instead of lossy), and thought this was one of them. My observations were about the web usage context only, and how 16 bits/sample PNGs are rendered in web browsers anyway.
    14 replies | 585 view(s)
  • skal's Avatar
    27th May 2020, 15:03
And yet, sequential codecs are more efficient than parallel ones: tile-based compression has sync points and contention that make the codec wait for threads to finish. Processing several images separately in parallel doesn't have this inefficiency (provided memory and I/O are not the bottleneck). Actually, sequential codecs are at an advantage in some quite important cases:
* image bursts on a phone camera (the sensor takes a sequence of photos in short bursts)
* web page rendering (which usually contains a lot of images; think the YouTube landing page)
* displaying photo albums (/thumbnails)
* back-end processing of a lot of photos in parallel (Cloudinary?)
Actually, I'd say parallel codecs are mostly useful for the Photoshop case (where you're using one photo only) and screen sharing (/slide decks).
Side note: JPEG can be made parallelizable using restart markers. The fact that no one is using it is somewhat telling.
In any case, I would have multiplied JPEG's MP/s by 4x in your table to get fair numbers.
    14 replies | 585 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 14:08
    The thing is, a bitstream needs to be suitable for parallel encode/decode. That is not always the case. Comparing using just 1 thread gives an unfair advantage to inherently sequential codecs. Typical machines have more than 4 cores nowadays. Even in phones, 8 is common. The tendency is towards more cores and not much faster cores. The ability to do parallel encode/decode is important.
    14 replies | 585 view(s)
  • skal's Avatar
    27th May 2020, 13:34
-sharp_yuv is not the default because it's slower: the defaults are adapted to the general common use, and the image you picked as a source is far from the common case the defaults are tuned for (all the more so since these images are better compressed losslessly!). Just because you have 4 cores doesn't mean you want to use them all at once, especially if you have several images to compress in parallel (which is often the case). To make a point with a fair comparison, it would have been less noisy to force 1 thread for all codecs. As presented, I find the text quite misleading.
    14 replies | 585 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 13:16
Good point, yes, better results are probably possible for all codecs with custom encoder options. I used default options for all. Numbers are for 4 threads, as is mentioned in the blog post. On a single core, libjpeg-turbo will be faster. Using more than four cores, jxl will be significantly faster still. It's hard to find a CPU with fewer than 4 cores these days.
    14 replies | 585 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 13:07
Correct. The article shows only that crop, but the sizes are for the whole image. Also, lossless WebP wouldn't be completely lossless since this is a 16-bit PNG (quantizing to 8-bit introduces very minor color banding).
    14 replies | 585 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 12:55
Sorry, yes, drop the f_jpg,q_97 to get the actual original PNG.
    14 replies | 585 view(s)
  • Kirr's Avatar
    27th May 2020, 10:45
Yeah, the two are a tool and a library implementing the DEFLATE algorithm, that is more accurate to say. In my benchmark, by "gzip" I refer to the software tool, not to the "gzip" file format. zlib has "zpipe.c" in its "examples" directory. This may be what you mean. I guess there is no point testing it, but perhaps I should benchmark it to confirm this. It seems 7-Zip is still Windows-exclusive. However there is a more portable "p7zip" - I will think about adding it to the benchmark.
    5 replies | 351 view(s)
  • suryakandau@yahoo.co.id's Avatar
    27th May 2020, 09:34
    @darek could you test paq8sk19 -x15 -w -e1,english.dic on enwik9 please ? thank you
    83 replies | 7829 view(s)
  • cssignet's Avatar
    27th May 2020, 09:13
The host (https://i.slow.pics/) did some kind of post-processing on the PNG (dropping the iCCP chunk and recompressing the image data less efficiently). Those files are not what I uploaded (see the edited link in my first post).
    14 replies | 585 view(s)
  • hunman's Avatar
    27th May 2020, 08:01
    hunman replied to a thread MCM + LZP in Data Compression
    Maybe you can integrate it into Bulat's FreeARC...
    53 replies | 35330 view(s)
  • Li weiqin's Avatar
    27th May 2020, 05:46
    Li weiqin replied to a thread MCM + LZP in Data Compression
I've used this wonderful program for a year and wondered who made it. Now I have found this, thank you. But it's hard to use for normal people like me, since it can only run from the command line and compresses one file per operation. If somebody could design a GUI or make a graphical version, that would be great.
    53 replies | 35330 view(s)
  • SolidComp's Avatar
    27th May 2020, 03:25
    Your lossless reduction darkened the image though. Look at them side by side.
    14 replies | 585 view(s)
  • cssignet's Avatar
    27th May 2020, 02:19
I guess the original PNG would be this: https://res.cloudinary.com/cloudinary-marketing/image/upload/Web_Assets/blog/high_fidelity.png

Some trials with close filesize (webp = no meta, png = meta):
cwebp -q 91 high_fidelity.png -o q91.webp (52.81 KB) -> q91.png
cwebp -q 90 -sharp_yuv high_fidelity.png -o q90-sharp.webp (52.06 KB) -> q90-sharp.png

It would be unrelated to the point of the article itself, but still, since web delivery is mentioned, a few points from an end-user POV on the samples/results:
- About PNG itself, the encoder used here would make very over-bloated data for a web context, making the initial filesize non-representative of the format (the original PNG is 2542.12 KB, but the expected rendering for web could be losslessly encoded to 227.08 KB with all chunks). As a side note, this PNG encoder also wrote non-standard keys for zTXt/tEXt chunks, and non-standard chunks (caNv).
Btw, instead of mathematically lossless only, do you plan to somehow provide a "web lossless"? I did not try, but feeding the lossless (math) encoder with a 16 bits/sample PNG would probably create an over-bloated file for web usage.
    14 replies | 585 view(s)
  • SvenBent's Avatar
    27th May 2020, 01:34
Thank you for the testing. I ran into some of the same issues with ECT: it appears ECT uses a much higher number of blocks than pngout. I reported this issue to caveman in his huffmix thread: https://encode.su/threads/1313-Huffmix-a-PNGOUT-r-catalyst?p=65017&viewfull=1#post65017
Personally, since DeflOpt never increases size, I do not believe it has the biggest effect with huffmix, but I can mix ECT + DeflOpt /b with ECT + defluff + DeflOpt /b, as defluff sometimes increases size. I wonder what the huffmix success rate is for ECT -9 with pngout /f6 /ks /kp /force on the ECT file.
    469 replies | 125789 view(s)
  • Shelwien's Avatar
    26th May 2020, 23:30
    https://www.phoronix.com/scan.php?page=news_item&px=Torvalds-Threadripper Yes, but he just wanted more threads.
    1 replies | 93 view(s)
  • skal's Avatar
    26th May 2020, 23:18
Also:
* You forgot to use the '-sharp_yuv' option for the webp example (53kb). Otherwise, it would have given you a quite sharper version: (and note that this webp was encoded from the jpeg-q97, not the original PNG).
* In the "Computational Complexity" section, I'm very surprised that JPEG XL is faster than libjpeg-turbo. Did you forget to mention multi-thread usage?
    14 replies | 585 view(s)
  • skal's Avatar
    26th May 2020, 21:35
Jon, your "Original PNG image (2.6 MB)" is actually a JPEG (https://res.cloudinary.com/cloudinary-marketing/image/upload/f_jpg,q_97/Web_Assets/blog/high_fidelity.png) when downloaded. Did you mean to add 'f_jpg,q_97' to the URL?
    14 replies | 585 view(s)
  • SolidComp's Avatar
    26th May 2020, 21:19
    SolidComp replied to a thread Brotli in Data Compression
    Wow it shrunk jQuery down to 10 KB! That's impressive. The dictionary is 110 KB, but that's a one-time hit. There were error messages on dictionary creation though. I don't really understand them:
    256 replies | 82160 view(s)
  • Jon Sneyers's Avatar
    26th May 2020, 20:30
    Hi everyone! I wrote a blog post about the current state of JPEG XL and how it compares to other state-of-the-art image codecs. https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    14 replies | 585 view(s)
  • SolidComp's Avatar
    26th May 2020, 19:31
    "gzip" as such isn't a command line interface to the zlib library. It's just a format, one of three that zlib supports (the other two are raw DEFLATE and a "zlib" format, also DEFLATE-based). GNU gzip is just a specific app that produces gzip files (and maybe others?). I think zlib has a program that you can easily build. It might be called minizip. Someone please correct me if I'm wrong. The 7-Zip gzipper is unrelated to the .7z or LZMA formats. I'm speaking of 7-Zip the app. It can produce .7z, .xz, gzip (.gz), .zip, .bz2, and perhaps more compression formats. Pavlov wrote his own gzipper from scratch, apparently, and it's massively better than any other gzipper, like GNU gzip or libdeflate. I assume it's better than zlib's gzipper as well. I don't understand how he did it. So if you want to compare the state of the art to gzip, it would probably make sense to use the best gzipper. His gzip files are 17% smaller than libdeflate's on text...
    5 replies | 351 view(s)
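Editor's note on the "gzip is just a format zlib supports" point above: a minimal sketch using the standard zlib API, where the wrapper is selected purely by the windowBits argument of deflateInit2() (15+16 = gzip, 15 = zlib wrapper, -15 = raw DEFLATE).

    /* Sketch: producing gzip-format output with plain zlib (single-shot). */
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    int main(void) {
        const char* text = "hello, gzip-from-zlib\n";
        unsigned char out[256];

        z_stream s;
        memset(&s, 0, sizeof(s));                       /* zalloc/zfree/opaque = Z_NULL */
        if (deflateInit2(&s, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                         15 + 16 /* gzip wrapper */, 8, Z_DEFAULT_STRATEGY) != Z_OK)
            return 1;

        s.next_in   = (Bytef*)text;
        s.avail_in  = (uInt)strlen(text);
        s.next_out  = out;
        s.avail_out = sizeof(out);
        if (deflate(&s, Z_FINISH) != Z_STREAM_END)      /* whole input fits in one call */
            return 1;

        printf("wrote %lu gzip bytes\n", (unsigned long)s.total_out);
        deflateEnd(&s);
        return 0;
    }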
  • Scope's Avatar
    26th May 2020, 19:15
    Scope replied to a thread JPEG XL vs. AVIF in Data Compression
How JPEG XL Compares to Other Image Codecs
https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    16 replies | 803 view(s)
  • smjohn1's Avatar
    26th May 2020, 19:12
OK, that makes sense too. So reducing LZ4_DISTANCE_MAX doesn't necessarily increase compression speed. That might be a sweet spot in terms of compression speed.
    4 replies | 205 view(s)
  • Cyan's Avatar
    26th May 2020, 18:57
    In fast mode, finding more matches corresponds to effectively skipping more data and searching less, so it tends to be faster indeed.
    4 replies | 205 view(s)
  • smjohn1's Avatar
    26th May 2020, 17:18
You are right. I checked the code again, and the memory use level was indeed 18 instead of 14. So that was the reason, which makes sense. On the other hand, a smaller LZ4_DISTANCE_MAX results in a (slight) decrease in compression speed. Is that because literal processing (memory copy) is slower than match processing?
    4 replies | 205 view(s)
  • lz77's Avatar
    26th May 2020, 10:47
https://habr.com/ru/news/t/503658/
Sorry, in Russian.
    1 replies | 93 view(s)
  • Krishty's Avatar
    26th May 2020, 10:23
    While huffmix works great with pngout /r, I had little success using it on combinations of ECT/DeflOpt/defluff. Details here: https://encode.su/threads/3186-Papa%E2%80%99s-Optimizer?p=65106#post65106 I should check whether there is a way to use ECT similar to pngout /r, i.e. whether block splits are stable with different parameters …
    469 replies | 125789 view(s)
  • Krishty's Avatar
    26th May 2020, 09:54
I did some experiments with huffmix according to this post by SvenBent: https://encode.su/threads/2274-ECT-an-file-optimizer-with-fast-zopfli-like-deflate-compression?p=64959&viewfull=1#post64959 (There is no public build because I haven’t gotten a response from caveman so far regarding the huffmix license.)

I tested a few thousand PNGs from my hard drive. Previous optimization used ECT + defluff + DeflOpt; now it uses huffmix on all intermediate results. Some observations:

Without pngout, huffmix has only three streams to choose from: ECT output, ECT + DeflOpt, ECT + defluff + DeflOpt. So there is not much gain to expect. Actual gains were seen in about one out of fifty files. These were mostly 1-B gains; one file got 7 B smaller and another 13 B. The larger the file, the larger the gains.

The error rate increased significantly:
- “More than 1024 Deflate blocks detected, this is not handled by this version.” (known error with large PNGs)
- “Type 0 (uncompressed) block detected, this is not handled by this version.” (known error)
- On a few files, huffmix terminated without any error message.

There is a huge increase in complexity: Previously, there was just one pipeline for all DEFLATE-based formats. I have to maintain a separate path for ZIP now, which is supported by ECT/defluff/DeflOpt but not by huffmix.
- If huffmix exits with error code 0, all is well.
- Else if huffmix exits with error code 1, parse stdout:
  - If stdout contains “Type 0 (uncompressed) block detected, this is not handled by this version.”, we have a soft error. Just pick the smaller file.
  - Else if stdout contains “More than 1024 Deflate blocks detected, this is not handled by this version.”, we have a soft error. Just pick the smaller file.
  - Else if stderr (not stdout!) contains “is an unknown file type”, we have hit a file that huffmix doesn’t understand. Just pick the smaller file. (This also happens with defluff and archives containing some Unicode characters.)
  - Else we have a hard error like a destroyed file; abort.

There is much more file I/O going on, and there seems to be a very rare race condition with Win32’s CopyFile and new processes. Not huffmix’s fault, but something that is now being triggered much more often.

All in all, I’m not sure I should keep working on it. It definitely is a great tool, but it comes with so many limitations and rough edges that the few bytes it gains me over my existing pipeline hardly justify the increased complexity.
    80 replies | 20696 view(s)
  • Jyrki Alakuijala's Avatar
    26th May 2020, 09:53
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
The same custom dictionary that zstd uses can be used for brotli. In my testing, half the time zstd's custom dictionary builder wins against Brotli's similar tool, half the time the opposite. Surprisingly often it is an even better strategy (for resulting compression density) to take 10 random samples of the data and concatenate them as a custom dictionary, rather than trying to be smart about it.
    256 replies | 82160 view(s)
  • Cyan's Avatar
    26th May 2020, 05:22
    Cyan replied to a thread Brotli in Data Compression
    I think 5 samples is the absolute minimum. Sometimes, even that is not enough, when samples are pretty small. But 90K is relatively large, so that should do it (assuming you are adding multiple copies of the same file, adding empty files wouldn't work). Looking at your screenshot, I noticed a wildcard character `*`. I have no idea how shell expansion works on Windows. Chances are, it doesn't. Prefer using the `-r` command to load all files from a directory, this capability is internal to `zstd` so it should be okay even on Windows, since it doesn't depend on any shell capability.
    256 replies | 82160 view(s)
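Editor's note: for readers following this dictionary sub-thread, a sketch of the same workflow at the library level, using libzstd's public zdict.h/zstd.h API (roughly what `zstd --train` and `zstd -D` do on the command line). The synthetic samples are only there to keep the example self-contained; as Cyan notes above, the trainer needs enough non-trivial samples and may fail otherwise. Link with -lzstd.

    /* Sketch: train a zstd dictionary from in-memory samples, then use it
     * (library-level analogue of `zstd --train` + `zstd -D`). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zstd.h>
    #include <zdict.h>

    enum { NB_SAMPLES = 1000, SAMPLE_SIZE = 512 };

    int main(void) {
        /* Build a flat buffer of concatenated samples plus their sizes. */
        static size_t sampleSizes[NB_SAMPLES];
        char* samples = malloc((size_t)NB_SAMPLES * SAMPLE_SIZE);
        for (int i = 0; i < NB_SAMPLES; i++) {
            char* s = samples + (size_t)i * SAMPLE_SIZE;
            int off = snprintf(s, SAMPLE_SIZE, "{\"id\":%d,\"status\":\"ok\",\"payload\":\"", i);
            memset(s + off, 'x', (size_t)(SAMPLE_SIZE - off));   /* shared filler */
            sampleSizes[i] = SAMPLE_SIZE;
        }

        char dict[4 * 1024];   /* dictionary buffer, cf. zstd's --maxdict */
        size_t dictSize = ZDICT_trainFromBuffer(dict, sizeof(dict),
                                                samples, sampleSizes, NB_SAMPLES);
        if (ZDICT_isError(dictSize)) {   /* can happen with too few or too uniform samples */
            fprintf(stderr, "training failed: %s\n", ZDICT_getErrorName(dictSize));
            return 1;
        }

        /* Compress one sample with the trained dictionary (CLI: zstd -D dict file). */
        char dst[ZSTD_COMPRESSBOUND(SAMPLE_SIZE)];
        ZSTD_CCtx* cctx = ZSTD_createCCtx();
        size_t csize = ZSTD_compress_usingDict(cctx, dst, sizeof(dst),
                                               samples, SAMPLE_SIZE,
                                               dict, dictSize, 3 /* level */);
        printf("dictionary: %zu bytes, one sample: %d -> %zu bytes\n",
               dictSize, SAMPLE_SIZE, csize);
        ZSTD_freeCCtx(cctx);
        free(samples);
        return 0;
    }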
  • Cyan's Avatar
    26th May 2020, 05:14
    I don't get the same result. When testing with LZ4_DISTANCE_MAX == 32767, I get 57548126 for enwik8, aka slightly worse than a distance of 64KB. In order to get the same compressed size as you, I first need to increase the memory usage by quite a bit (from 16 KB to 256 KB), which is the actual reason for the compression ratio improvement (and compression speed decrease). The impact of MAX_DISTANCE is less dramatic than for high compression mode because, by the very nature of the fast mode, it doesn't have much time to search, so most searches will end up testing candidates at rather short distances anyway. But still, reducing max distance should nonetheless, on average, correspond to some loss of ratio, even if a small one.
    4 replies | 205 view(s)
  • Kirr's Avatar
    26th May 2020, 03:24
Thanks for the kind words, SolidComp. I work with tons of biological data, which motivated me to first make a compressor for such data, and then this benchmark. I'll probably add FASTQ data in the future, if time allows.

As for text, HTML, CSS and other data, I have no immediate plans for it. There are three main obstacles: 1. Computation capacity. 2. Selecting relevant data. 3. My time needed to work on it. Possibly it will require cooperating with other compression enthusiasts. I'll need to think about it.

I'm under the impression that "zlib" is a compression library, and "gzip" is a command line interface to this same library. Since I benchmark command line compression tools, it's "gzip" that is included, rather than "zlib". However, please let me know if there is some alternative command line "zlib gzipper" that I am missing.

Igor Pavlov's excellent LZMA algorithm (which powers 7-Zip) is represented by the "xz" compressor in the benchmark. Igor's unfortunate focus on Windows releases allowed "xz" to become the standard LZMA implementation on Linux (as far as I understand).

You mean this one - https://github.com/ebiggers/libdeflate ? Looks interesting, I'll take a look at it. I noticed this bit in the GitHub readme: "libdeflate itself is a library, but the following command-line programs which use this library are also provided: gzip (or gunzip), a program which mostly behaves like the standard equivalent, except that it does not yet have good streaming support and therefore does not yet support very large files" - Not supporting very large files sounds alarming, especially without specifying what exactly they mean by "very large".

Regarding gzip, don't get me started! Every single biological database shares data in gzipped form, wasting huge disk space and bandwidth. There is a metric ton of research on biological sequence compression, in addition to excellent general-purpose compressors. Yet the field remains stuck with gzip. I want to show that there are good alternatives to gzip, and that there are large benefits in switching. Whether this will have any effect remains to be seen. At least I migrated all my own data to a better format (saving space and increasing access speed).
    5 replies | 351 view(s)
  • smjohn1's Avatar
    26th May 2020, 00:36
From README.md: `LZ4_DISTANCE_MAX`: controls the maximum offset that the compressor will allow. Set to 65535 by default, which is the maximum value supported by the lz4 format. Reducing the maximum distance will reduce opportunities for LZ4 to find matches, hence will produce a worse compression ratio.

The above is true for the high compression modes, i.e., levels above 3, but the opposite is true for compression levels 1 and 2. Here is a test result using the default value (65535):

<TestData> lz4-v1.9.1 -b1 enwik8
 1#enwik8 : 100000000 -> 57262281 (1.746), 325.6 MB/s, 2461.0 MB/s

and a result using a smaller value (32767):

<TestData> lz4-1.9.1-32 -b1 enwik8
 1#enwik8 : 100000000 -> 53005796 (1.887), 239.3 MB/s, 2301.1 MB/s

Anything unusual in the LZ4_compress_generic() implementation? Could anyone shed some light? Thanks in advance.
    4 replies | 205 view(s)
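Editor's note: since LZ4_DISTANCE_MAX is a compile-time macro, reproducing this comparison means rebuilding liblz4 with a different value (e.g. cc -O3 -DLZ4_DISTANCE_MAX=32767 -c lz4.c, per the README text quoted above) and re-running the same call. A minimal harness along these lines is sketched below; the file name is an assumption, and the calling code is identical for both builds.

    /* Sketch: level-1 LZ4 compression of a whole file (what `lz4 -b1` measures),
     * used unchanged against liblz4 builds with different LZ4_DISTANCE_MAX. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <lz4.h>

    int main(void) {
        FILE* f = fopen("enwik8", "rb");           /* assumed test file */
        if (!f) { perror("enwik8"); return 1; }
        fseek(f, 0, SEEK_END);
        long srcSize = ftell(f);
        fseek(f, 0, SEEK_SET);

        char* src = malloc((size_t)srcSize);
        char* dst = malloc((size_t)LZ4_compressBound((int)srcSize));
        if (!src || !dst || fread(src, 1, (size_t)srcSize, f) != (size_t)srcSize) return 1;
        fclose(f);

        /* Fast mode, acceleration 1 -- the "level 1" path discussed above. */
        int csize = LZ4_compress_default(src, dst, (int)srcSize,
                                         LZ4_compressBound((int)srcSize));
        printf("%ld -> %d bytes (ratio %.3f)\n", srcSize, csize,
               csize > 0 ? (double)srcSize / csize : 0.0);

        free(src); free(dst);
        return 0;
    }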
  • Shelwien's Avatar
    25th May 2020, 23:14
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    I don't think so - there're bugfixes and various tweaks (mostly jpeg model), according to changelog. All changes should be included in v89. If you need something to test, why not test different ppmd parameters? https://github.com/kaitz/paq8pxd/blob/master/paq8pxd.cpp#L12013 These numbers there (12,6,210,64 etc) are somewhat random, so you can try increasing or decreasing them and check how it affects compression. (12 and 6 are PPM orders and 210,64 are memory allocation per ppmd instance).
    921 replies | 314115 view(s)
  • Darek's Avatar
    25th May 2020, 22:42
    Darek replied to a thread Paq8pxd dict in Data Compression
    Are there any changes worth to test v87 and v88?
    921 replies | 314115 view(s)
  • Shelwien's Avatar
    25th May 2020, 21:09
    > "shooting" hundreds of compressors a day That won't really work with gigabyte-sized datasets. At slowest allowed speeds it would take more than a hour to compress it. Number of attempts would be limited simply because of limited computing power (like 5 or so).
    12 replies | 723 view(s)
  • schnaader's Avatar
    25th May 2020, 20:10
Another question that comes to my mind regarding a private dataset: will there be automation involved to get results quickly? Because with a private dataset I imagine literally "shooting" hundreds of compressors a day using different dictionaries to analyze the data. So would this be a valid and working strategy? Alex's quote "organizers will provide some samples" points in the direction of reducing this a bit so you can also tune offline, but it would still be useful.
    12 replies | 723 view(s)
  • SolidComp's Avatar
    25th May 2020, 17:39
Hi all – @Kirr made an incredibly powerful compression benchmark website called the Sequence Compression Benchmark. It lets you select a bunch of options and run it yourself, with outputs including graphs, column charts, and tables. It can run every single level of every compressor. The only limitation I see at this point is the lack of text datasets – it's mostly genetic data.

@Kirr, four things:
1. Broaden it to include text? Would that require a name change or ruin your vision for it? It would be great to see web-based text, like the HTML, CSS, and JS files of the 100 most popular websites for example.
2. The gzipper you currently use is the GNU gzip utility program that comes with most Linux distributions. If you add some text datasets, especially web-derived ones, the zlib gzipper will make more sense than the GNU utility. That's the gzipper used by virtually all web servers.
3. In my limited testing the 7-Zip gzipper is crazy good, so good that it approaches Zstd and brotli levels. It's long been known to be better than GNU gzip and zlib, but I didn't know it approached Zstd and brotli. It comes with the 7-Zip Windows utility released by Igor Pavlov. You might want to include it.
4. libdeflate is worth a look. It's another gzipper.

The overarching message here is that gzip ≠ gzip. There are many implementations, and the GNU gzip utility is likely among the worst.
    5 replies | 351 view(s)
  • SolidComp's Avatar
    25th May 2020, 17:20
    SolidComp replied to a thread Brotli in Data Compression
    Five times or five files? I added a fifth file, same error. Screenshot below:
    256 replies | 82160 view(s)
  • Shelwien's Avatar
    25th May 2020, 16:05
    Shelwien replied to a thread Brotli in Data Compression
@SolidComp: Sorry, I left a mistake after renaming the samples subdirectory :) After running gen.bat, the dictionary is in the file named "dictionary". If you're on Linux, you can just repeat the operations in gen.bat manually: zstd --train produces the dictionary, zstd -D compresses using it. Then there's also this option to control the dictionary size:
--maxdict=# : limit dictionary to specified size (default: 112640)
    256 replies | 82160 view(s)
  • Cyan's Avatar
    25th May 2020, 15:25
    Cyan replied to a thread Zstandard in Data Compression
--patch-from is a new capability designed to reduce the size of transmitted data when updating a file from one version to another. In this model, it is assumed that:
- the old version is present at the destination site
- new and old versions are relatively similar, with only a handful of changes.
If that's the case, the compression ratio will be ridiculously good. zstd will see the old version as a "dictionary" when generating the patch and when decompressing the new version. So it's not a new format: the patch is a regular zstd compressed file.
    432 replies | 130538 view(s)
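Editor's note: a rough sketch of the same idea through libzstd's stable advanced API (an illustration, not the implementation behind the CLI flag): the old version is referenced as a prefix "dictionary" on both sides, and the patch that comes out is an ordinary zstd frame. For real, large files the CLI additionally raises the window size to cover the old file. Link with -lzstd.

    /* Sketch: patch generation/application by referencing the old version
     * as a prefix dictionary (the idea behind zstd --patch-from). */
    #include <stdio.h>
    #include <string.h>
    #include <zstd.h>

    int main(void) {
        const char oldv[] = "config: mode=fast threads=4 window=27 checksum=on";
        const char newv[] = "config: mode=fast threads=8 window=27 checksum=on";

        /* Create the "patch": compress the new version against the old one. */
        ZSTD_CCtx* cctx = ZSTD_createCCtx();
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 19);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
        ZSTD_CCtx_refPrefix(cctx, oldv, sizeof(oldv));        /* old version as prefix */
        char patch[256];
        size_t patchSize = ZSTD_compress2(cctx, patch, sizeof(patch), newv, sizeof(newv));

        /* Apply the patch: decompression must reference the same old version. */
        ZSTD_DCtx* dctx = ZSTD_createDCtx();
        ZSTD_DCtx_refPrefix(dctx, oldv, sizeof(oldv));
        char rebuilt[256];
        size_t outSize = ZSTD_decompressDCtx(dctx, rebuilt, sizeof(rebuilt),
                                             patch, patchSize);

        printf("patch: %zu bytes, rebuilt ok: %d\n", patchSize,
               (int)(outSize == sizeof(newv) && memcmp(rebuilt, newv, outSize) == 0));
        ZSTD_freeCCtx(cctx);
        ZSTD_freeDCtx(dctx);
        return 0;
    }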
  • Cyan's Avatar
    25th May 2020, 15:15
    Cyan replied to a thread Brotli in Data Compression
    You could try it 5 times. Assuming that the source file is ~90K, this should force the trainer to provide a dictionary from this material. Note though that the produced dictionary will be highly tuned for this specific file, which is not the target model. In production environment, we tend to use ~10K samples, randomly extracted from an even larger pool, in order to generate a dictionary for a category of documents.
    256 replies | 82160 view(s)
  • Jyrki Alakuijala's Avatar
    25th May 2020, 10:17
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
Based on a quick look at the makefiles, we are not using the fast-math option. However, there can be more uncertainty, like perhaps using multiply-and-add as a single instruction leading to a different result than doing multiply and add as two separate instructions. (I'm a bit out of touch with this field. Compilers, vectorization and new instructions are improved constantly.)
    256 replies | 82160 view(s)
  • Kirr's Avatar
    25th May 2020, 05:58
    Kirr replied to a thread Zstandard in Data Compression
    From source, when possible. Thanks, will clarify it on website (in the next update).
    432 replies | 130538 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:16
    SolidComp replied to a thread Zstandard in Data Compression
    Do you build the compressors from source, or do you use the builds provided by the projects?
    432 replies | 130538 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:11
    SolidComp replied to a thread Brotli in Data Compression
    Do you specify fast math in the makefile or cmake?
    256 replies | 82160 view(s)
  • SolidComp's Avatar
    25th May 2020, 05:11
    SolidComp replied to a thread Brotli in Data Compression
    Where's the dictionary?
    256 replies | 82160 view(s)
  • Kirr's Avatar
    25th May 2020, 02:59
    Kirr replied to a thread Zstandard in Data Compression
    zstd is now updated to 1.4.5 in my benchmark: http://kirr.dyndns.org/sequence-compression-benchmark/ I noticed good improvement in decompression speed for all levels, and some improvement in compression speed for slower levels. (Though I am updating from 1.4.0, so the improvement may be larger than from 1.4.4).
    432 replies | 130538 view(s)
  • redrabbit's Avatar
    25th May 2020, 02:09
    Thanks for the explanation and the testing
    84 replies | 13256 view(s)
  • terrelln's Avatar
    25th May 2020, 01:55
    terrelln replied to a thread Zstandard in Data Compression
    Both single-thread and multi-thread modes are deterministic, but they produce different results. Multi-threaded compression produces the same output with any number of threads. The zstd cli defaults to multi-threaded compression with 1 worker thread. You can opt into single-thread compression with --single-thread.
    432 replies | 130538 view(s)
  • schnaader's Avatar
    24th May 2020, 23:47
OK, so here's the long answer. I could reproduce the bad performance on the Life is Strange 2 testfile; my results are in the tables below. There are two things this all boils down to: preflate (vs. zlib brute force in xtool and Precomp 0.4.6) and multithreading.

Note that both the Life is Strange 2 times and the decompressed size are very similar for Precomp 0.4.6 and xtool when considering the multithreading factor (computed by using the time command and dividing "user" time by "real" time). Also note that the testfile has many small streams (64 KB decompressed each); preflate doesn't seem to use its multithreading in that case.

Although preflate can be slower than zlib brute force, it also has big advantages, which can be seen when looking at the Eternal Castle testfile. It consists of big PNG files; preflate can make use of multithreading (though not fully utilizing all cores) and is faster than the zlib brute force. And the zlib brute force doesn't even manage to recompress any of the PNG files. Xtool's (using reflate) decompressed size is somewhere between those two, most likely because reflate doesn't parse multi-PNG files and can only decompress parts of them because of this.

So, enough explanation, how can the problem be solved? Multithreading, of course. The current branch already features multithreading for JPEG when using -r, and I'm working on it for deflate streams. When it's done, I'll post fresh results for the Life is Strange 2 testfile; they should be very close to xtool if things work out well. Multithreaded -cn or -cl, though, is a bit more complex; I've got some ideas, but I have to test them and it will take longer.

Test system: Hetzner vServer CPX21: AMD Epyc, 3 cores @ 2.5 GHz, Ubuntu 20.04 64-bit

Eternal Castle testfile, 223,699,564 bytes

program                      decompressed size  time (dec./recomp.)      MT factor (dec./recomp.)  compressed size (-nl)
Precomp 0.4.8dev -cn -d0     5,179,454,907      5 min 31 s / 4 min 45 s  1.73 / 1.64               118,917,128
Precomp 0.4.6 -cn -d0        223,699,589        8 min 31 s               1.00                      173,364,804
xtool (redrabbit's result)   2,099,419,005

Life is Strange 2 testfile, 632,785,771 bytes

program                                  decompressed size  time (dec./recomp.)      MT factor (dec./recomp.)
Precomp 0.4.8dev -cn -intense0 -d0       1,499,226,364      3 min 21 s / 2 min 14 s  0.91 / 0.99
Precomp 0.4.8dev (after tempfile fix)    1,499,226,364      3 min 11 s / 2 min 21 s  0.92 / 0.99
Precomp 0.4.6 -cn -intense0 -d0          1,497,904,244      1 min 55 s / 1 min 43 s  0.93 / 0.98
xtool 0.9 e:precomp:32mb,t4:zlib (Wine)  1,497,672,882      46 s / 36 s              2.75 / 2.87
    84 replies | 13256 view(s)