Activity Stream

  • Darek's Avatar
    Today, 16:58
    Darek replied to a thread Paq8sk in Data Compression
    Could you post source code in every version?
    91 replies | 8249 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Today, 15:17
    Paq8sk23 - improve text model - faster than paq8sk22 ​
    91 replies | 8249 view(s)
  • Dresdenboy's Avatar
    Today, 13:42
    As you're mentioning Crinkler: I had some interesting discussions with Ferris, who created Squishy. He uses a small decompressor to decompress the actual decompressor; he said the smaller one is about 200 B. Then there is xlink, where unlord planned to have a talk at Revision Online 2020, but it was cancelled. There seems to be some progress, though, which he hasn't published yet. This might also be interesting to watch. BTW, my smallest decompression loop (for small data sizes and only a match length of 2) is 12 B. Making it more generic for offsets "blows" it up to 17 B. A typical LZ with multiple lengths starts at ~20 B, depending on variant and assumptions. There are likely similarities to Stefan Atev's lost one. But I have to continue testing all those variants (with their respective encoders) to publish more about them. Another idea was to encode some x86-specific prefix; such a prefix emitter can be as small as 15 B.
    10 replies | 1028 view(s)
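    A rough C sketch of the kind of decompression loop being counted above (illustrative only - the real routines are hand-written x86 and use their own token formats; this toy format with one flag byte per token, single-byte offsets and a fixed match length of 2 is an assumption made for the example):
        #include <stddef.h>
        #include <stdint.h>

        /* Toy LZ decoder: a 0 byte means "copy the next input byte literally",
           a non-zero byte c means "copy 2 bytes from c positions back in the output". */
        size_t toy_lz_decode(const uint8_t *in, size_t in_len, uint8_t *out)
        {
            size_t ip = 0, op = 0;
            while (ip < in_len) {
                uint8_t c = in[ip++];
                if (c == 0) {
                    out[op++] = in[ip++];          /* literal */
                } else {
                    out[op] = out[op - c]; op++;   /* match, copied byte-by-byte so */
                    out[op] = out[op - c]; op++;   /* overlapping references work   */
                }
            }
            return op;
        }
    In assembly this collapses to a load, a test, a branch and a couple of copy instructions, which is where figures like 12-20 bytes come from.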
  • Dresdenboy's Avatar
    Today, 02:21
    No problem. Accumulating this information sounds useful. I also collected a lot of information and did some analysis, both of encoding ideas for small executables and of existing compressors. Recently I stumbled upon ZX7 and, related to it, Saukav. The latter is cool as it creates a decompressor based on the actual compression variant and parameters. Before finding it I had already deemed this a necessity to keep sizes small, especially for tiny intros, where a coder could omit some of the generic compressor code (code mover, decompression to the original address) to save even more bytes (aside from writing compression-friendly code). Here is an example with some of the tested compression algorithms (sizes w/o decompressor stubs and other data blocks, e.g. in the PAQ archive file format), leaving out all samples with less than 5% reduction, as they might be compressed already:
    Also interesting would be the total size incl. decompressor (not done yet). In this case we might just see different starting offsets (decompressor stub, tables etc.) on the Y axis and different gradients with increasing X.
    10 replies | 1028 view(s)
  • introspec's Avatar
    Yesterday, 23:54
    Yes, thank you. I should have mentioned that when I gave my estimated tiny compressor sizes, I had a quick look in several places and definitely used Baudsurfer's page for reference. Unfortunately, his collection of routines is not very systematic (in the sense that I know better examples for at least some CPUs, e.g. Z80), so I am hoping that a bit more representative collection of examples can be gradually accumulated here.
    10 replies | 1028 view(s)
  • Shelwien's Avatar
    Yesterday, 21:41
    I would also like to see some other limitations of the contest:
    > I read that there would be a speed limit, but what about a RAM limit.
    I guess there would be a natural one - the test machine obviously won't have infinite memory.
    > There are fast NN compressors, like MCM, or LPAQ.
    Yes, these would be acceptable, just not full PAQ or CMIX.
    > It's hard to fight LZ algorithms like RAZOR so I wouldn't try going in that direction.
    Well, RZ is a ROLZ/LZ77/Delta hybrid. It's still easy enough to achieve better compression via CM/PPM/BWT (and better encoding speed too). Or much faster decoding with worse compression.
    > Are AVX and other instruction sets allowed?
    Yes, but likely not AVX512, since it's hard to find a test machine for it.
    > What would be nice is some default preprocessing.
    > If it's an english benchmark, why shouldn't .drt preprocessing (like the one from cmix)
    > be available by choice (or .wrt + english.dic like the one from paq8pxd).
    I proposed that, but this approach has a recompression exploit - somebody could undo our preprocessing, then apply something better. So we'd try to explain that preprocessing is expected and post links to some open-source WRT implementations, but the data won't be preprocessed by default.
    > It would save some time for the developers not to incorporate them into their compressors,
    > if there were a time limit for the contest.
    It should run for a few months, so there should be enough time. There are plenty of ways to make a better preprocessor, WRT is not the only option (e.g. NNCP's preprocessor outputs a 16-bit alphabet), so it's not a good idea to block that and/or force somebody to work on WRT reverse-engineering.
    15 replies | 920 view(s)
  • Jarek's Avatar
    Yesterday, 18:19
    Jarek replied to a thread Kraken compressor in Data Compression
    Road to PS5: https://youtu.be/ph8LyNIT9sg?t=1020 custom kraken >5GB/s decompressor ...
    45 replies | 23998 view(s)
  • Dresdenboy's Avatar
    Yesterday, 17:57
    Thanks for opening this thread. I'm working on my own tiny decompression experiments. And for starters let me point you to Baudsurfer's (Olivier Poudade) assembly art section on his Assembly Language Page: http://olivier.poudade.free.fr/ (site seems a bit buggy sometimes), which has several tiny compressors and decompressors for different platforms.
    10 replies | 1028 view(s)
  • Darek's Avatar
    Yesterday, 14:33
    Darek replied to a thread Paq8sk in Data Compression
    I will. At least I'll try :) I need 2-3 days to finish a task which is in progress, and then I'll start paq8sk19. paq8sk22 looks to me like a move in the wrong direction - a very slight improvement at the cost of doubled compression time.
    91 replies | 8249 view(s)
  • Darek's Avatar
    Yesterday, 14:29
    Darek replied to a thread Paq8sk in Data Compression
    @Sportman - it's a dramatic change in compression time - does this version use much more memory than the previous one?
    91 replies | 8249 view(s)
  • AlexDoro's Avatar
    Yesterday, 09:39
    I would vote for private. I would also like to see some other limitations of the contest:
    I read that there would be a speed limit, but what about a RAM limit?
    There are fast NN compressors, like MCM, or LPAQ. I mean they could be a starting point for some experimental fast compressors.
    It's hard to fight LZ algorithms like RAZOR, so I wouldn't try going in that direction.
    Are AVX and other instruction sets allowed?
    What would be nice is some default preprocessing. If it's an English benchmark, why shouldn't .drt preprocessing (like the one from cmix) be available by choice (or .wrt + english.dic like the one from paq8pxd)? It would save some time for the developers not to incorporate them into their compressors, if there were a time limit for the contest.
    15 replies | 920 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 06:21
    How about paq8sk22 -x15 -w -e1,english.dic for enwik8
    91 replies | 8249 view(s)
  • Trench's Avatar
    Yesterday, 05:04
    1. What is a programmer? A translator from one language to another. What is a designer? A person that creates. So what is a programmer that tries to create a better file compression like? A translator that wants to change profession to be the next best-selling author like Stephen King.
    2. What should you probably not say when you failed to pack a file? "Fudge, I failed to pack it."
    3. Watch out for your phrasing if you ask another whether they can squeeze your dongle.
    4. A programmer was asked "what are you doing?" and said "concentrating on how to un-concentrate something". The other said "easy, have a few beers".
    5. So the drunk programmer went and bought un-concentrated orange juice and sat on the carton, and a person asks why they are sitting on it. They say "to concentrate it, obviously".
    6. When you ask another to compress lemon juice for you and then wonder if it can be decompressed, maybe it's time to take a break.
    7. Don't be impressed if someone compresses a file for you by 99% of the file size, since they will say it can't be decompressed because you didn't also ask for that.
    8. A judge tells your lawyer to zip it, you misunderstand and say "no, RAR", and you are held in contempt for growling at the judge.
    9. A programmer says he was looking for new ways of packing files all day and another says you must be tired from lifting so many files.
    10. Your friend wants forgiveness from you and sends you a gift in a 7-Zip file. You uncompress the file, and there is another compressed file, and after the 7th file it is still compressed, with a new name saying Matthew 18:21-22. Can you guess how many files you have left to uncompress?
    7 replies | 1620 view(s)
  • Amsal's Avatar
    Yesterday, 03:53
    Well, I can't vote but I would go with the private dataset option. A few of the reasons why I prefer Option 2 over Option 1:
    1. The resulting compressor/algorithm has more general use in practical ways than a compressor which is optimized for a specific file/dataset, which is pretty useless most of the time if you think about it.
    2. Allowing the use of a dictionary is also a great add-on in the contest.
    3. I have no problem (and I suppose most people won't have) if an algorithm/compressor uses 10 methods (precomp+srep+lzma etc.) or just modifies 1 method (like lzma) to produce better results on multiple datasets, as long as it is getting results which could be used as a better option in practical ways on multiple datasets.
    I totally agree with these three points by you as well, and it would be great to have a contest like this. And as for me, I wouldn't even care about a 16 MB compressor if it really saves more size than any other compressor, when I compress a 50 GB dataset to something like 10 GB while other compressors are around 12 GB - a 16 MB compressor is a small size to account for. But anyway, it's a competition, so we take account of everything, so fine by me :D
    15 replies | 920 view(s)
  • Trench's Avatar
    Yesterday, 02:53
    AMD was for years mostly better based on price/performance. It is like: why pay a billion for something that gives 1% better gain while the alternative costs a dollar for 1% less performance. You can buy more AMD CPUs to outperform Intel. It's just that Intel has better marketing. AMD just does not get it, since they are bad at presentation - I don't even understand their ordering, and they always reference Intel as the benchmark for performance. Big game companies got it and have used AMD for a while now. So in short, a million dollars of AMD CPUs can beat a million dollars worth of Intel CPUs. But to be fair, 15 years is a long time. Also, Linux still sucks and will remain at 2% popularity, since they just don't get it and cannot give it away for free no matter how many types they have. It's mainly a hobby OS and not fit for use by ordinary people, which makes it pointless despite being more powerful. Android is better since designers took over. Which goes to show: never hire a translator to write novels, just like never hire a programmer as a designer. Progress is slowed down when one profession insists on doing another profession's work.
    2 replies | 120 view(s)
  • Sportman's Avatar
    Yesterday, 01:47
    Sportman replied to a thread Paq8sk in Data Compression
    enwik8:
    15,755,063 bytes, 14,222.427 sec., paq8sk22 -x15 -w
    15,620,894 bytes, 14,940.285 sec., paq8sk22 -x15 -w -e1,english.dic
    91 replies | 8249 view(s)
  • suryakandau@yahoo.co.id's Avatar
    29th May 2020, 19:54
    The result using paq8sk22 -s6 -w -e1,English.dic on the Dickens file is 1,900,420 bytes.
    91 replies | 8249 view(s)
  • Cyan's Avatar
    29th May 2020, 19:41
    Cyan replied to a thread Zstandard in Data Compression
    This all depends on storage strategy. Dictionary is primarily useful when there are tons of small files. But if the log lines are just appended into a single file, as is often the case, then just compress the file normally, it will likely compress very well.
    435 replies | 130870 view(s)
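    For the "tons of small files" case mentioned above, the zstd command-line tool can train and apply a dictionary directly; a minimal sketch (the file names are made up for the example):
        # train a dictionary from a directory of small log files
        zstd --train logs/*.log -o spamd.dict
        # compress / decompress individual files with that dictionary
        zstd -D spamd.dict logs/2020-10-19.log -o 2020-10-19.log.zst
        zstd -D spamd.dict -d 2020-10-19.log.zst -o 2020-10-19.log
    If the log lines are instead appended to one big file, the dictionary brings little, and plain "zstd file.log" is the simpler option - which is the point made above.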
  • Jon Sneyers's Avatar
    29th May 2020, 18:34
    Yes, that would work. Then again, if you do such non-standard stuff, you can just as well make JPEG support alpha transparency by using 4-component JPEGs with some marker that says that the fourth component is alpha (you could probably encode it in such a way that decoders that don't know about the marker relatively gracefully degrade by interpreting the image as a CMYK image that looks the same as the desired RGBA image except it is blended to a black background). Or you could revive arithmetic coding and 12-bit support, which are in the JPEG spec but just not well supported. I guess the point is that we're stuck with legacy JPEG decoders, and they can't do parallel decode. And we're stuck with legacy JPEG files, which don't have a jump table. And even if we would re-encode them with restart markers and jump tables, it would only give parallel striped decode, not efficient cropped decode.
    15 replies | 727 view(s)
  • suryakandau@yahoo.co.id's Avatar
    29th May 2020, 15:37
    Paq8sk22 - improve text model ​
    91 replies | 8249 view(s)
  • pklat's Avatar
    29th May 2020, 14:58
    pklat replied to a thread Zstandard in Data Compression
    What would be the best way to create a dictionary for log files such as these from SpamAssassin:
    Oct 19 03:42:59 localhost spamd: spamd: connection from blabla.bla.com :61395 to port 1783, fd 5
    Oct 19 03:42:59 localhost spamd: spamd: checking message <0OY10MRFLRL00@blablafake.com> for (unknown):101
    Oct 19 03:43:00 localhost spamd: spamd: clean message (3.3/8.0) for (unknown):101 in 2.0 seconds, 8848 bytes.
    Oct 19 03:43:00 localhost spamd: spamd: result: . 3 - DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,MIME_HTML_ONLY,MISSING_FROM,RDNS_NONE scantime=2.0,size=8848,user=(unknown),uid=101,required_score=8.0,rhost=blablafake.com,raddr=ip.add.re.ss,rport=45995,mid=<b9a461d565a@blabla.com>,autolearn=no autolearn_force=no
    How do I manually create a dictionary?
    435 replies | 130870 view(s)
  • Bulat Ziganshin's Avatar
    29th May 2020, 12:40
    Bulat Ziganshin replied to a thread Brotli in Data Compression
    AFAIK the zstd "dictionary" is just prepended data for LZ matches. This approach can be used with any LZ compressor. The brotli dictionary, on the other hand, is a list of byte sequences, plus 6 (?) transformations that can be applied to these byte sequences before inserting them into the stream. You can prepend data for LZ matches with brotli too.
    257 replies | 82366 view(s)
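    The "prepended data" style of dictionary described above can be expressed directly with zstd's prefix API; a minimal sketch (assumes zstd >= 1.4, error handling trimmed):
        #include <zstd.h>

        /* Compress src while letting the matcher reference `prefix` as if it
           were prepended history (raw content only, no entropy tables). */
        size_t compress_with_prefix(const void *prefix, size_t prefix_len,
                                    const void *src, size_t src_len,
                                    void *dst, size_t dst_cap)
        {
            ZSTD_CCtx *cctx = ZSTD_createCCtx();
            ZSTD_CCtx_refPrefix(cctx, prefix, prefix_len); /* cleared after each compress call */
            size_t n = ZSTD_compress2(cctx, dst, dst_cap, src, src_len);
            ZSTD_freeCCtx(cctx);
            return n; /* check with ZSTD_isError() in real code */
        }
    Decompression has to reference the same prefix via ZSTD_DCtx_refPrefix() before each call.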
  • Bulat Ziganshin's Avatar
    29th May 2020, 12:34
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    We may need a separate topic, but my little insight is the following: in image compression, we have a 2D model and try to predict each pixel using data from the left and above. In video, we even have a 3rd dimension (the previous frame). General compression is usually limited to 1D, although repeated distances and literal masking added a tiny bit of a 2nd dimension to the LZ data model.
    Patching is a natural 2D model - rather than considering it as the 1D concatenation "ORIGINAL MODIFIED", you should look at it as ORIGINAL stacked above MODIFIED. This changes the model for LZ back references - we should keep a "current pointer" into the ORIGINAL data and try to encode each reference relative to this pointer. It will reduce the encoded reference size and thus allow referencing smaller strings from the ORIGINAL data. Also, we can use masked literals, i.e. use the "corresponding byte" as the context for encoding the current one.
    Knowledge that we are patching should also allow faster match search. Each time the previous match ends, we have 1) the current byte in the MODIFIED data, 2) the "current byte" in the ORIGINAL data, 3) the last actually used byte in the ORIGINAL data. So we suppose that the next match may have srcpos near 2 or 3, and dstpos at 1 or a bit later. So we may look around for smaller matches (2-3 bytes) before going to a full-scale search.
    435 replies | 130870 view(s)
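    A small sketch of the "relative reference" idea from the post above (illustrative C only, not an existing format): instead of storing the absolute source position of each copy from ORIGINAL, store its delta from where the aligned pointer is expected to be, so unchanged regions cost almost nothing to reference.
        #include <stddef.h>

        typedef struct { long delta; size_t len; } Ref;

        /* Emit one copy-from-ORIGINAL reference while patching.
           *exp_src tracks where the next match is expected to start in ORIGINAL. */
        static Ref make_ref(size_t src_pos, size_t len, size_t *exp_src)
        {
            Ref r;
            r.delta = (long)src_pos - (long)*exp_src; /* usually tiny, often 0 */
            r.len   = len;
            *exp_src = src_pos + len;                 /* advance the expected pointer */
            return r;
        }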
  • pklat's Avatar
    29th May 2020, 11:28
    OK Jyrki, will do. Forgot to mention that I don't have an AVX CPU, which was required before, if that matters.
    155 replies | 36608 view(s)
  • Jyrki Alakuijala's Avatar
    29th May 2020, 10:27
    Jyrki Alakuijala replied to a thread Brotli in Data Compression
    This is likely a misunderstanding. Brotli can use the same linear dictionaries used in zstd, with the same tooling. The dictionary mechanism with a simple grammar is in addition to that, but ordinary linear dictionaries can be used. One just gets a bit less benefit from them (but not less than zstd gets from these simple dictionaries). Zstd does not yet support transforms on dictionaries as far as I know.
    257 replies | 82366 view(s)
  • Jyrki Alakuijala's Avatar
    29th May 2020, 10:13
    Could you file an issue either on the jpegxl GitLab or on the brunsli GitHub repo and we will look at it and make them not differ. We have run this with lots of files successfully, so I suspect this is either a special corner case or, more likely, a recent bug. We are currently converting these compressors to more streaming operation and to more easily streamable APIs, and this bug might have come from that effort. Thank you in advance!
    155 replies | 36608 view(s)
  • Shelwien's Avatar
    29th May 2020, 06:38
    Adding decompressor size requires absurd data sizes to avoid exploits (for a 1 GB dataset, compressed zstd size is still ~0.1% of the total result). Otherwise the contest can turn into a decoder-size optimization contest, if the intermediate 1st place is open-source. Also, Alex pushes for a mixed dataset (part public, part private, with uncertain shares), but I think that it just combines the negatives of both options (overtuning is still possible on the public part, decoder size is still necessary to avoid exploits, and the compressed size of the secret part is still not 100% predictable in advance).
    15 replies | 920 view(s)
  • SvenBent's Avatar
    29th May 2020, 04:49
    I can't vote but I would vote private/secret. The public dataset encourages over-tuning, which is not really helpful or a demonstration of general compression ability. In the real world the compressor does not know the data ahead of compression time. I would still add size + decompressor though.
    15 replies | 920 view(s)
  • Cyan's Avatar
    29th May 2020, 04:09
    Cyan replied to a thread Zstandard in Data Compression
    So far, we have only thoroughly compared with bsdiff. We can certainly extend the comparison to more products, to get a more complete picture. MT support for --patch-from works just fine.
    In terms of positioning, zstd is trying to bring speed to the formula: fast generation of patches, fast application of patches. There are use cases which need speed and will like this trade-off, compared to more established solutions which tend to be less flexible in terms of range of speed. At this stage, we don't try to claim the "best" patch size. There are a few scenarios where zstd can be quite competitive, but that's not always the case. This first release will hopefully help us understand users' expectations, in order to select the next batch of improvements.
    This is new territory for us; there is still plenty of room for improvements, both feature- and performance-wise. One unclear aspect to me is how much benefit a dedicated diff engine (as opposed to recycling our "normal" search engine) could achieve while preserving the zstd format. There are, most likely, some limitations introduced by the format, since it wasn't created with this purpose in mind. But how much comes from the format, as opposed to the engine? This part is unclear to me. Currently, I suspect that the most important limitations come from the engine, hence better patch sizes should be possible.
    435 replies | 130870 view(s)
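    For anyone wanting to try the feature discussed above, the CLI usage (zstd >= 1.4.5) looks roughly like this; file names are placeholders:
        # create a patch that turns old.bin into new.bin
        zstd -19 --long=31 --patch-from=old.bin new.bin -o patch.zst
        # apply the patch
        zstd -d --long=31 --patch-from=old.bin patch.zst -o new.restored.bin
    --long=31 enlarges the match window (up to 2 GB), which matters when the inputs are larger than the default window; the matching flag is needed again on the decompression side.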
  • Shelwien's Avatar
    29th May 2020, 02:42
    Shelwien replied to a thread Zstandard in Data Compression
    I asked FitGirl to test it... got this:
    1056507088 d2_game2_003.00 // (1) game data
    1383948734 d2_game2_003.resources // (2) precomp output
    327523769 d2_game2_003.resources.x5 // xdelta -5
    245798553 d2_game2_003.resources.x5.8 // compressed
    278021923 d2_game2_003.resources.zsp // zstd -patch
    247363158 d2_game2_003.resources.zsp.8 // compressed
    Speed-wise zstd patching seems good, but it has a 2G window limit, MT support for this is unknown, and overall specialized patchers seem to work better.
    435 replies | 130870 view(s)
  • Bulat Ziganshin's Avatar
    29th May 2020, 01:53
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    Cyan, do you compared it against xdelta and similar algos?
    435 replies | 130870 view(s)
  • skal's Avatar
    28th May 2020, 15:36
    The trick is to put the index ("jump table") in the COM section reserved for comments. Mildly non-standard JPEGs, but workable.
    15 replies | 727 view(s)
  • Kirr's Avatar
    28th May 2020, 13:49
    Thanks James, looks like I'll have to add libdeflate soon. I'm still worried about its gzip replacement not supporting large files. I guess I'll see what they mean.
    5 replies | 377 view(s)
  • pklat's Avatar
    28th May 2020, 13:12
    I've tried a lossless JPEG repack again with an updated jpegxl and some other image. It looks normal now, it didn't skew it, but it's not rotated, so presumably something is lost (metadata?). Is there some option for the (d)encoder to unpack to the original besides '-j'? I've tried zero-padding the start of the unpacked file and comparing it to the original, but couldn't find any match. I also converted both to .bmp but they differ.
    155 replies | 36608 view(s)
  • Jyrki Alakuijala's Avatar
    28th May 2020, 12:45
    There are two separate optimization goals: S1, what is best for users, and S2, what is easiest for the webmaster (or other deployer of compression) so they can run business as usual.
    For S1 we need to look at the whole information system and its efficiency and cost factors. Often we can find out by using simple economic modeling that the cost of user time and attention is 1000x more valuable than the cost of the computer. This is understandable in light of the computer and the energy consumed being worth around $1000, while a human has a value that trumps $1000000. Users can be blocked by two different things related to compression: data transfer payload size and excessive coding resource use - decoding speed, and rarely encoding, too. Current compression methods in general are not spending enough CPU and memory to fully optimize for S1. That is because people also optimize for S2.
    For S2, people often consider compression separately, outside of the larger information processing system. Then they run a profiler and see % values. Many engineers are willing to save 20% of CPU speed while losing 5% of density. The %-to-% comparison seems superficially like an oranges-to-oranges comparison. They may however lack the knowledge that only the density modulates the user-experienced speed, and that that cost is 1000x more than the CPU cost. They are able to transfer cost from their company to their clients. Also, these engineers may not even have access to data transfer costs, so that they could at least do a pure cost-based optimization with disregard to users. However, ignoring the users will cost the company revenue and growth. Saving a few thousand in compression CPU use can lead to a double-digit revenue drop for a big e-commerce company.
    I often see that junior engineers are very keen on profile-based optimization and try to switch to faster compression algorithms that are only a few percent worse, and the more senior engineers with a holistic system point of view stop them - or ask them to run an experiment to show that there is no negative effect on conversions/revenue/returning users etc. For naive positioning based on S2, we will see many webservers configured with no compression or the ineffective gzip quality 1 compression.
    16 replies | 856 view(s)
  • Jon Sneyers's Avatar
    28th May 2020, 11:59
    I wouldn't say sequential codecs are more efficient than parallel ones: you can always just use a single thread, and avoid the (of course unavoidably imperfect) parallel scaling. If you have enough images to process at the same time (like Cloudinary, or maybe rendering a website with lots of similar-sized images), you can indeed best just parallelize that way and use a single thread per image. There are still cases where you don't have enough images to process in parallel single-thread processes to keep your cores busy, though. For end-users, I think the "Photoshop case" is probably a rather common case.
    Restart markers in JPEG only allow you to do parallel encode, not parallel decode. A decoder doesn't know if and where the next restart marker occurs, and what part of the image data it represents. You can also only do stripes with restart markers, not tiles. So even if you'd add some custom metadata to make an index of restart marker bitstream/image offsets, it would only help to do full-image parallel decode, not cropped decode (e.g. decoding just a 1000x1000 region from a gigapixel image). I don't think the fact that no one is trying to do this is telling. Applications that need efficient parallel/cropped decode (e.g. medical imaging) just don't use JPEG, but e.g. JPEG 2000.
    Multiplying the JPEG numbers by 4 doesn't make much sense, because you can't actually decode a JPEG 4x faster on 4 cores than on 1 core. Dividing the JPEG XL numbers by 3 (for decode) and by 2 (for encode) is what you need to do to get "fair" numbers: that's the speed you would get on a single core (the factor is not 4 because parallel scalability is never perfect). There's a reason why all the HEIC files produced by Apple devices use completely independently encoded 256x256 tiles. Otherwise encode and decode would probably be too slow. The internal grid boundary artifacts are a problem in this approach though.
    15 replies | 727 view(s)
  • Shelwien's Avatar
    28th May 2020, 02:59
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    Works like this for me. You only need to compile one cpp file - pxd.cpp itself. It already includes everything else, except for zlib.
    921 replies | 314748 view(s)
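    A minimal compile line along those lines (assuming the single source file is named paq8pxd.cpp and zlib headers/libraries are installed; adjust the file name to the version you actually have):
        g++ -O3 -march=native paq8pxd.cpp -o paq8pxd -lz
    If linking against -lz is inconvenient, compiling zlib's .c files alongside the cpp file works as well.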
  • suryakandau@yahoo.co.id's Avatar
    28th May 2020, 01:10
    Inside the paq8sk19 archive there is g.bat to compile it, just rename the cpp file: https://encode.su/threads/3371-Paq8sk/page3
    921 replies | 314748 view(s)
  • LucaBiondi's Avatar
    28th May 2020, 01:00
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
    Hi all, I am trying to compile the latest sources of PAQ8PXD. I am using the script below, which I used to compile PAQ8PX, but I get errors. I am not good with C++. Could someone help me? Could someone give me a script? Thank you, Luca. This is the script: inside zlist I put: The errors are... Thank you as usual! Luca
    921 replies | 314748 view(s)
  • LucaBiondi's Avatar
    27th May 2020, 23:50
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
    Thank you! It could probably be useful to store the parameters in a text file (so... no file and you get the standard behaviour), or better still, to have the possibility to set them from the command line. Doing so we can avoid recompiling the sources every time. Luca
    921 replies | 314748 view(s)
  • Shelwien's Avatar
    27th May 2020, 21:24
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    > Do you know if there is a memory limitation allocating more than 1680 MB or 2000 MB?
    Afaik it should support up to 4095 MB on x64. It uses 32-bit pointers for the tree, so no more than 4 GB certainly, but up to that it should work. There's also a third parameter which controls whether ppmd resets statistics when its memory runs out: 0 = full reset, 1 = tree reduction which leaves 75% of the stats. In fact this is also a tunable parameter: https://github.com/kaitz/paq8pxd/blob/master/mod_ppmd.inc#L727 (3*(M>>2)) = 3*M/4 = 75%. You can edit it to "GetUsedMemory()>96*(SubAllocatorSize>>7)", then try adjusting the coefficient (1..127 instead of 96 here).
    921 replies | 314748 view(s)
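    Schematically, the trigger being discussed could be pictured like this (a self-contained toy, not the literal mod_ppmd code; GetUsedMemory/SubAllocatorSize are stand-ins with made-up values, shown only to illustrate why coef = 96 reproduces the current 75% threshold):
        #include <stdio.h>

        static unsigned long long SubAllocatorSize = 1680ULL << 20; /* 1680 MB pool (example) */
        static unsigned long long GetUsedMemory(void) { return 1300ULL << 20; }

        int main(void)
        {
            unsigned coef = 96; /* 96/128 = 75%, same ratio as 3*(M>>2) */
            int reduce_old = GetUsedMemory() > 3*(SubAllocatorSize>>2);    /* current condition */
            int reduce_new = GetUsedMemory() > coef*(SubAllocatorSize>>7); /* tunable variant */
            printf("reduce_old=%d reduce_new=%d\n", reduce_old, reduce_new);
            return 0;
        }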
  • JamesB's Avatar
    27th May 2020, 17:51
    I think Libdeflate is the fastest tool out there right now, unless limiting to light-weight "level 1" style, in which case maybe libslz wins out. We integrated libdeflate support into Samtools, for (de)compression of sequencing alignment data in the BAM format. I suspect this is the cause of libdeflate becoming an official Ubuntu package, as Samtools/htslib have it as a dependency.
    I recently retested several deflater algorithms on enwik8:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m5.003s    0m0.517s    36548933
    intel        0m3.057s    0m0.503s    36951028
    cloudflare   0m2.492s    0m0.443s    36511793
    jtkukunas    0m2.956s    0m0.357s    36950998
    ng           0m2.022s    0m0.377s    36881293
    zstd (gz-6)  0m4.674s    0m0.468s    36548933
    libdeflate   0m1.769s    0m0.229s    36648336
    Note the file sizes fluctuate a bit. That's within the difference between gzip -5 vs -6, so arguably you'd include that in the time difference too.
    I also tried them at level 1 compression:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m1.851s    0m0.546s    42298786
    intel        0m0.866s    0m0.524s    56046821
    cloudflare   0m1.163s    0m0.470s    40867185
    jtkukunas    0m1.329s    0m0.392s    40867185
    ng           0m0.913s    0m0.397s    56045984
    zstd (gz)    0m1.764s    0m0.475s    42298786
    libdeflate   0m1.024s    0m0.235s    39597396
    Level 1 is curious, as you can see very much how different versions have traded off encoder speed vs size efficiency, with cloudflare and jtkukunas apparently using the same algorithm, and intel/ng likewise. Libdeflate is no longer the fastest here, but it's not far off and is the smallest, so it's in a sweet spot.
    And for fun, level 9:
    Tool         Encode      Decode      Size
    ------------------------------------------
    vanilla      0m6.113s    0m0.516s    36475804
    intel        0m5.153s    0m0.516s    36475794
    cloudflare   0m2.787s    0m0.442s    36470203
    jtkukunas    0m5.034s    0m0.365s    36475794
    ng           0m2.872s    0m0.371s    36470203
    zstd (gz)    0m5.702s    0m0.467s    36475804
    libdeflate   0m9.124s    0m0.237s    35197159
    All remarkably similar sizes, bar libdeflate, which took longer but squashed it considerably more. Libdeflate actually goes up to -12, but it's not a good tradeoff on this file:
    libdeflate   0m14.660s   0m0.236s    35100586
    Edit: I tested 7z max too, but it was comparable to libdeflate max and much slower.
    5 replies | 377 view(s)
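    For reference, libdeflate's API is whole-buffer only (which is also why its gzip replacement has trouble with very large files, as mentioned in this thread); a minimal sketch of gzip compression with it:
        #include <stdlib.h>
        #include <libdeflate.h>

        /* Compress an in-memory buffer to gzip format at the given level (1..12).
           Returns a malloc'd buffer (caller frees) or NULL; *out_len gets the size. */
        void *gzip_whole_buffer(const void *in, size_t in_len, int level, size_t *out_len)
        {
            struct libdeflate_compressor *c = libdeflate_alloc_compressor(level);
            if (!c) return NULL;
            size_t bound = libdeflate_gzip_compress_bound(c, in_len);
            void *out = malloc(bound);
            if (out) {
                *out_len = libdeflate_gzip_compress(c, in, in_len, out, bound);
                if (*out_len == 0) { free(out); out = NULL; } /* output didn't fit */
            }
            libdeflate_free_compressor(c);
            return out;
        }
    Unlike zlib's streaming deflate(), the whole input and output have to fit in memory at once.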
  • LucaBiondi's Avatar
    27th May 2020, 17:10
    LucaBiondi replied to a thread Paq8pxd dict in Data Compression
    Hi Shelwien,
    About testing parameters, for ppmd_12_256_1:
    level 8:  order 12, memory 210 MB
    level 9:  order 16, memory 420 MB
    level 10: order 16, memory 840 MB
    level 11: order 16, memory 1680 MB
    level 12: order 16, memory 1680 MB
    level 13: order 16, memory 1680 MB
    level 14: order 16, memory 1680 MB
    level 15: order 16, memory 1680 MB
    For ppmd_6_64_2: order 6, memory 64 MB (for each level)
    Do you know if there is a memory limitation allocating more than 1680 MB or 2000 MB?
    Thank you,
    Luca
    921 replies | 314748 view(s)
  • cssignet's Avatar
    27th May 2020, 15:24
    Oops, sorry then (I removed the wrong statement), I have to admit that I did not check the results here :). I just did a few trials and met some situations where a PNG would be smaller stored as lossless with JXL (instead of lossy), and thought this was one of them. My observations were about the web usage context only, and how 16 bits/sample PNGs are rendered in web browsers anyway.
    15 replies | 727 view(s)
  • skal's Avatar
    27th May 2020, 15:03
    And yet, sequential codecs are more efficient than parallel ones: tile-based compression has sync points and contention that make the codec wait for threads to finish. Processing several images separately in parallel doesn't have this inefficiency (provided memory and I/O are not the bottleneck). Actually, sequential codecs are at an advantage in some quite important cases:
    * image bursts on a phone camera (the sensor takes a sequence of photos in short bursts)
    * web page rendering (which usually contains a lot of images. Think YouTube landing page.)
    * displaying photo albums (/thumbnails)
    * back-end processing of a lot of photos in parallel (Cloudinary?)
    Actually, I'd say parallel codecs are mostly useful for the Photoshop case (where you're using one photo only) and screen sharing (/slide decks).
    Side note: JPEG can be made parallelizable using restart markers. The fact that no one is using it is somewhat telling. In any case, I would have multiplied JPEG's MP/s by 4x in your table to get fair numbers.
    15 replies | 727 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 14:08
    The thing is, a bitstream needs to be suitable for parallel encode/decode. That is not always the case. Comparing using just 1 thread gives an unfair advantage to inherently sequential codecs. Typical machines have more than 4 cores nowadays. Even in phones, 8 is common. The tendency is towards more cores and not much faster cores. The ability to do parallel encode/decode is important.
    15 replies | 727 view(s)
  • skal's Avatar
    27th May 2020, 13:34
    -sharp_yuv is not the default because it's slower: the defaults are adapted to the general common use, and the image you picked as source is not the common case the defaults are tuned for, far from it. (All the more so since these images are better compressed losslessly!) Just because you have 4 cores doesn't mean you want to use them all at once, especially if you have several images to compress in parallel (which is often the case). For making a point with a fair comparison, it would have been less noisy to force 1 thread for all codecs. As presented, I find the text quite misleading.
    15 replies | 727 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 13:16
    Good point, yes, better results are probably possible for all codecs with custom encoder options. I used default options for all. Numbers are for 4 threads, as is mentioned in the blog post. On a single core, libjpeg-turbo will be faster. With more than four cores, jxl will be significantly faster still. It's hard to find a CPU with fewer than 4 cores these days.
    15 replies | 727 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 13:07
    Correct. ​The article shows only that crop, but the sizes are for the whole image. Also, lossless WebP wouldn't be completely lossless since this is a 16-bit PNG (quantizing to 8-bit introduces very minor color banding).
    15 replies | 727 view(s)
  • Jon Sneyers's Avatar
    27th May 2020, 12:55
    ​Sorry, yes, drop the f_jpg,q_97 to get the actual original PNG.
    15 replies | 727 view(s)
  • Kirr's Avatar
    27th May 2020, 10:45
    Yeah, it's more accurate to say that the two are a tool and a library implementing the DEFLATE algorithm. In my benchmark, by "gzip" I refer to the software tool, not to the "gzip" file format. zlib has "zpipe.c" in its "examples" directory - this may be what you mean. I guess there is no point testing it, but perhaps I should benchmark it to confirm this. It seems 7-Zip is still Windows-exclusive. However there is a more portable "p7zip" - I will think about adding it to the benchmark.
    5 replies | 377 view(s)
  • suryakandau@yahoo.co.id's Avatar
    27th May 2020, 09:34
    @darek could you test paq8sk19 -x15 -w -e1,english.dic on enwik9 please ? thank you
    91 replies | 8249 view(s)
  • cssignet's Avatar
    27th May 2020, 09:13
    The host (https://i.slow.pics/) did some kind of post-processing on the PNGs (dropping the iCCP chunk and recompressing the image data less efficiently). Those files are not what I uploaded (see the edited link in my first post).
    15 replies | 727 view(s)
  • hunman's Avatar
    27th May 2020, 08:01
    hunman replied to a thread MCM + LZP in Data Compression
    Maybe you can integrate it into Bulat's FreeARC...
    53 replies | 35381 view(s)
  • Li weiqin's Avatar
    27th May 2020, 05:46
    Li weiqin replied to a thread MCM + LZP in Data Compression
    I've used this wonderful tool for a year and wondered who made it. And now I've found this thread, thank you. But it's hard to use for normal people like me, since it can only be run from the command line and compresses one file per operation. If somebody could design a GUI or remake it as graphical software, that would be great.
    53 replies | 35381 view(s)
  • SolidComp's Avatar
    27th May 2020, 03:25
    Your lossless reduction darkened the image though. Look at them side by side.
    15 replies | 727 view(s)
  • cssignet's Avatar
    27th May 2020, 02:19
    I guess the original PNG would be this: https://res.cloudinary.com/cloudinary-marketing/image/upload/Web_Assets/blog/high_fidelity.png
    Some trials with close filesize (webp = no meta, png = meta):
    cwebp -q 91 high_fidelity.png -o q91.webp (52.81 KB) -> q91.png
    cwebp -q 90 -sharp_yuv high_fidelity.png -o q90-sharp.webp (52.06 KB) -> q90-sharp.png
    It would be unrelated to the point of the article itself, but still, since web delivery is mentioned, a few points from an end-user POV on the samples/results:
    - about PNG itself, the encoder used here would make very over-bloated data for a web context, making the initial filesize non-representative of the format (the original PNG is 2542.12 KB, but the expected rendering for the web could be losslessly encoded to 227.08 KB with all chunks). As a side note, this PNG encoder also wrote non-standard keys for zTXt/tEXt chunks and a non-standard chunk (caNv).
    BTW, instead of math-lossless only, did you plan somehow to provide a "web lossless"? I did not try it, but feeding the lossless (math) encoder with a 16 bits/sample PNG would probably create an over-bloated file for web usage.
    15 replies | 727 view(s)
  • SvenBent's Avatar
    27th May 2020, 01:34
    Thank you for the testing. I ran into some of the same issues with ECT: it appears ECT uses a lot higher number of blocks than pngout. I reported this issue to caveman in his huffmix thread: https://encode.su/threads/1313-Huffmix-a-PNGOUT-r-catalyst?p=65017&viewfull=1#post65017
    Personally, since DeflOpt never increases size, I do not believe it has the biggest effect with huffmix, but I can huffmix ECT + DeflOpt /b against ECT + defluff + DeflOpt /b, as defluff sometimes increases size.
    I wonder what the huffmix success rate is for ECT -9 with pngout /f6 /ks /kp /force on the ECT file.
    469 replies | 125943 view(s)
  • Shelwien's Avatar
    26th May 2020, 23:30
    https://www.phoronix.com/scan.php?page=news_item&px=Torvalds-Threadripper Yes, but he just wanted more threads.
    2 replies | 120 view(s)
  • skal's Avatar
    26th May 2020, 23:18
    Also:
    * you forgot to use the '-sharp_yuv' option for the webp example (53kb). Otherwise, it would have given you the quite sharper version: (and note that this webp was encoded from the jpeg-q97, not the original PNG).
    * in the "Computational Complexity" section, I'm very surprised that JPEG XL is faster than libjpeg-turbo. Did you forget to mention multi-thread usage?
    15 replies | 727 view(s)
  • skal's Avatar
    26th May 2020, 21:35
    Jon, your "Original PNG image (2.6 MB)" is actually a JPEG (https://res.cloudinary.com/cloudinary-marketing/image/upload/f_jpg,q_97/Web_Assets/blog/high_fidelity.png) when downloaded. Did you mean to add 'f_jpg,q_97' to the URL?
    15 replies | 727 view(s)
  • SolidComp's Avatar
    26th May 2020, 21:19
    SolidComp replied to a thread Brotli in Data Compression
    Wow it shrunk jQuery down to 10 KB! That's impressive. The dictionary is 110 KB, but that's a one-time hit. There were error messages on dictionary creation though. I don't really understand them:
    257 replies | 82366 view(s)
  • Jon Sneyers's Avatar
    26th May 2020, 20:30
    Hi everyone! I wrote a blog post about the current state of JPEG XL and how it compares to other state-of-the-art image codecs. https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    15 replies | 727 view(s)
  • SolidComp's Avatar
    26th May 2020, 19:31
    "gzip" as such isn't a command line interface to the zlib library. It's just a format, one of three that zlib supports (the other two are raw DEFLATE and a "zlib" format, also DEFLATE-based). GNU gzip is just a specific app that produces gzip files (and maybe others?). I think zlib has a program that you can easily build. It might be called minizip. Someone please correct me if I'm wrong. The 7-Zip gzipper is unrelated to the .7z or LZMA formats. I'm speaking of 7-Zip the app. It can produce .7z, .xz, gzip (.gz), .zip, .bz2, and perhaps more compression formats. Pavlov wrote his own gzipper from scratch, apparently, and it's massively better than any other gzipper, like GNU gzip or libdeflate. I assume it's better than zlib's gzipper as well. I don't understand how he did it. So if you want to compare the state of the art to gzip, it would probably make sense to use the best gzipper. His gzip files are 17% smaller than libdeflate's on text...
    5 replies | 377 view(s)
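    For anyone who wants to reproduce the 7-Zip gzip results mentioned above, the command line is along these lines (p7zip on Linux uses the same syntax; the -mfb/-mpass values are the usual "maximum effort" settings and can be tuned):
        7z a -tgzip -mx=9 -mfb=258 -mpass=15 enwik8.gz enwik8
    The output is a normal .gz file that any gzip/zlib decoder can read; only the encoder-side parsing effort differs.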
  • Scope's Avatar
    26th May 2020, 19:15
    Scope replied to a thread JPEG XL vs. AVIF in Data Compression
    How JPEG XL Compares to Other Image Codecs ​https://cloudinary.com/blog/how_jpeg_xl_compares_to_other_image_codecs
    16 replies | 856 view(s)
  • smjohn1's Avatar
    26th May 2020, 19:12
    OK, that makes sense too. So reducing LZ4_DISTANCE_MAX doesn't necessarily increase compression speed. That might be a sweet spot in terms of compression speed.
    4 replies | 223 view(s)
  • Cyan's Avatar
    26th May 2020, 18:57
    In fast mode, finding more matches corresponds to effectively skipping more data and searching less, so it tends to be faster indeed.
    4 replies | 223 view(s)
  • smjohn1's Avatar
    26th May 2020, 17:18
    You are right. I checked the code again, and the memory use level was indeed 18 instead of 14. So that was the reason, which makes sense. On the other hand, a smaller LZ4_DISTANCE_MAX results in a speed decrease (though slight) in compression. Is that because literal processing (memory copy) is slower than match processing?
    4 replies | 223 view(s)
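    For context on LZ4_DISTANCE_MAX in the posts above: in recent lz4 releases it is a compile-time macro guarded by #ifndef, so shrinking the match window means overriding it when building the library, e.g. (the 4096 value is just an example):
        # rebuild the lz4 object with a 4 KB window instead of the default 64 KB
        cc -O3 -DLZ4_DISTANCE_MAX=4096 -c lz4.c
    Streams produced this way remain decodable by any standard LZ4 decoder, since shorter distances are always valid; only the encoder's search range changes.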
  • lz77's Avatar
    26th May 2020, 10:47
    https://habr.com/ru/news/t/503658/ ​Sorry, in Russian.
    2 replies | 120 view(s)
  • Krishty's Avatar
    26th May 2020, 10:23
    While huffmix works great with pngout /r, I had little success using it on combinations of ECT/DeflOpt/defluff. Details here: https://encode.su/threads/3186-Papa%E2%80%99s-Optimizer?p=65106#post65106 I should check whether there is a way to use ECT similar to pngout /r, i.e. whether block splits are stable with different parameters …
    469 replies | 125943 view(s)