Activity Stream

  • SpyFX's Avatar
    Today, 20:27
    SpyFX replied to a thread zpaq updates in Data Compression
    libzpaq.h.txt - LIBZPAQ Version 7.00 header - Dec. 15, 2014. libzpaq.cpp - LIBZPAQ Version 6.52 implementation - May 9, 2014. Last version => libzpaq.h - LIBZPAQ Version 7.12 header - Apr. 19, 2016. libzpaq.cpp - LIBZPAQ Version 7.15 implementation - Aug. 17, 2016.
    2611 replies | 1118879 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Today, 20:03
    Paq8sk38 - improved jpeg compression. Using the -8 option:
    f.jpg (darek corpus): 112038 bytes -> 81110 bytes
    a10.jpg (maximum compression corpus): 842468 bytes -> 623432 bytes
    dsc_0001.jpg: 3162196 bytes -> 2188750 bytes
    Source code and binary file are inside the package.
    181 replies | 16150 view(s)
  • fcorbelli's Avatar
    Today, 19:16
    fcorbelli replied to a thread zpaq updates in Data Compression
    This is not so useful for me, because on clients I use only NVMe/SSD disks. Can you please send me your EXE, or try mine? http://www.francocorbelli.it/zpaqlist.exe
    zpaqlist l "h:\zarc\copia_zarc.zpaq" -out z:\default.txt
    zpaqlist l "h:\zarc\copia_zarc.zpaq" -all -out z:\all.txt
    zpaqlist l "h:\zarc\copia_zarc.zpaq" -until 10 -out z:\10.txt
    I attach the source, if you want to compile it yourself. The output (-all) of 715 is sorted by version, then by file. Mine is sorted by file, then by version (aka: like a Time Machine). To reduce the time to write (and read) from disk I "deduplicate" the filenames to just "?" (an invalid filename). Writing and reading a 600MB file (typically the 715 list output of a complex zpaq) on magnetic disks takes time. Shrinking it to 170MB (my test bed) is faster, but not really quick. --- Result: my PAKKA Windows GUI is much faster than anything else I have found. Of course... only a half dozen competitors :D But it doesn't satisfy me anyway.
    2611 replies | 1118879 view(s)
  • Sportman's Avatar
    Today, 16:14
    CPU Security Mitigation Performance On Intel Tiger Lake: https://www.phoronix.com/scan.php?page=article&item=tiger-lake-mitigations
    30 replies | 5852 view(s)
  • lz77's Avatar
    Today, 11:07
    Not so surprised, because zlib compresses it 7.3 sec. longer than zstd. I'm surprised by pglz's c_time in Test 1, text, Rapid. Here is their moral: they are ready to do anything for money! ;)
    128 replies | 13100 view(s)
  • SpyFX's Avatar
    Today, 02:02
    SpyFX replied to a thread zpaq updates in Data Compression
    ..\zpaq715.exe l DISK_F_Y_????.zpaq -all > zpaq715.first.txt
    zpaq v7.15 journaling archiver, compiled Aug 17 2016
    DISK_F_Y_????.zpaq: 778 versions, 833382 files, 24245031 fragments, 764858.961314 MB
    30027797.206934 MB of 30027797.206934 MB (834160 files) shown
      -> 2000296.773364 MB (415375470 refs to 24245031 of 24245031 frags) after dedupe
      -> 764858.961314 MB compressed.
    54.032 seconds (all OK)
    Z:\ZPAQ\backup>..\zpaq715.exe l DISK_F_Y_????.zpaq -all > zpaq715.first.txt
    54.032 seconds (all OK)
    Z:\ZPAQ\backup>..\zpaq715.exe l DISK_F_Y_????.zpaq -all > zpaq715.second.txt
    38.453 seconds (all OK)
    Z:\ZPAQ\backup>..\zpaq715.exe l DISK_F_Y_????.zpaq -all > zpaq715.third.txt
    38.812 seconds (all OK)
    The first launch caches the h/i blocks, which is why the next launches are processed faster. For my archive, the time to get a list of files is 38 seconds when all blocks are in the system file cache. Can you do multiple launches? You should reset the system file cache before the first run.
    2611 replies | 1118879 view(s)
  • Gotty's Avatar
    Yesterday, 23:03
    Gotty replied to a thread Paq8sk in Data Compression
    m1->set(column >> 3 | min(5 + 2 * static_cast<int>(comp == 0), zu + zv), 2048);
    m1->set(coef | min(7, zu + zv), 2048);
    m1->set(mcuPos, 1024);
    m1->set(coef | min(5 + 2 * static_cast<int>(comp == 0), zu + zv), 1024);
    m1->set(coef | min(3, ilog2(zu + zv)), 1024);
    181 replies | 16150 view(s)
  • fcorbelli's Avatar
    Yesterday, 20:52
    fcorbelli replied to a thread zpaq updates in Data Compression
    You are right, but in my case the archive is big (400+ GB) so the overhead is small. A very quick and very dirty test (same hardware, from SSD to ramdisk, decent CPU).
    Listing of copia_zarc.zpaq: 1057 versions, 3271902 files, 166512 fragments, 5730.358526 MB
    796080.407841 MB of 796080.407841 MB (3272959 files) shown
      -> 10544.215824 MB (13999029 refs to 166512 of 166512 frags) after dedupe
      -> 5730.358526 MB compressed.
    ZPAQ 7.15 64 bit (sorted by version): zpaq64 l h:\zarc\copia_zarc.zpaq -all >z:\715.txt
      60.297 seconds, 603.512.182 bytes output
    zpaqlist 64 bit - franz22 (sorted by file): zpaqlist l "h:\zarc\copia_zarc.zpaq" -all -out z:\22.txt
      15.047 seconds, 172.130.287 bytes output
    Way slower on magnetic disks, painfully slow from a magnetic-disk NAS, even with 10Gb ethernet.
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    Yesterday, 19:03
    SpyFX replied to a thread zpaq updates in Data Compression
    The fake file is limited to 64k by the zpaq format; I think one such file is not enough. It seems to me that placing the file sizes in such a file would significantly speed up getting a list of files in the console, but it would be redundant to write the entire list of files in each version.
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    Yesterday, 18:24
    fcorbelli replied to a thread zpaq updates in Data Compression
    I will think about it. Yes, and an optional ASCII list of all the files and all versions. So, when you "list" an archive, the fake file is decompressed and sent out as output. Another option is ASCII comments in versions, so you can do something like: add ... blablabla -comment "my first version". I work almost exclusively with Delphi :D
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    Yesterday, 17:39
    SpyFX replied to a thread zpaq updates in Data Compression
    I wrote that in zpaq there is no size limit on the c/block. At the moment I decided to store a second usize there, equal to the sum of the sizes of all d/blocks + all h/blocks; this makes it possible to jump immediately to the first i/block, and it seems to me there is a small speedup. Since there are no boundaries, you can store any additional information in the c/block. I don't like that the c/block is not aligned to a 4K boundary, so at the moment I create it with a 4K size, so that the final rewrite does not touch other data.
    Do I understand correctly that the fake file is supposed to store information about checksums and file sizes? I like this idea. I also plan to use it for 4K alignment of the first h/i-block in each version and of subsequent c/blocks in the archive. Alignment lets me simplify the algorithm for reading blocks from the archive without system buffering, because I don't like that working with a zpaq archive pushes useful cached data out of the server's RAM (a rough sketch of such an unbuffered, aligned read follows below).
    p/s I understand that it is very difficult to make fundamental changes in the zpaq code, so I am rewriting all the work with the zpaq archive; I use C# and my own zpaq api (dll), and almost everything already works :)) But your ideas are forcing me to change the concept of creating a correct archive that would address your wishes as well.
    2611 replies | 1118879 view(s)
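    A minimal sketch of the kind of unbuffered, aligned read mentioned above (Win32, C++ rather than C#; the archive name is hypothetical). FILE_FLAG_NO_BUFFERING requires the file offset, the read length and the buffer address to be sector-aligned, which is exactly what 4K-aligned blocks buy:
    #include <windows.h>
    #include <cstdio>

    int main() {
        const DWORD kBlock = 4096;  // aligned block size (a multiple of the sector size)
        HANDLE h = CreateFileA("archive.zpaq", GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, nullptr);
        if (h == INVALID_HANDLE_VALUE) { std::printf("open failed\n"); return 1; }
        // VirtualAlloc returns page-aligned memory, which satisfies the alignment rule.
        void* buf = VirtualAlloc(nullptr, kBlock, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        LARGE_INTEGER pos; pos.QuadPart = 0;          // offset must be a multiple of the sector size
        SetFilePointerEx(h, pos, nullptr, FILE_BEGIN);
        DWORD got = 0;
        if (ReadFile(h, buf, kBlock, &got, nullptr))  // length must also be sector-aligned
            std::printf("read %lu bytes without touching the system cache\n", (unsigned long)got);
        VirtualFree(buf, 0, MEM_RELEASE);
        CloseHandle(h);
        return 0;
    }
    From C# the same flags can be reached through P/Invoke of CreateFile; that part is outside this sketch.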
  • Ms1's Avatar
    Yesterday, 14:51
    There are only 5 entries in this category, to my regret. It's extremely improbable to change, because the other people still having issues with libraries are aiming at slower categories. Thus definitely no lower than 5th. It would be too early for me to comment on such questions, but aren't you surprised by the zlib result? It's not far from Zstd in terms of compression ratio.
    128 replies | 13100 view(s)
  • fcorbelli's Avatar
    Yesterday, 14:50
    fcorbelli replied to a thread zpaq updates in Data Compression
    Only partially, or at least for me. Two big problems: 1- no space to store anything in a version (ASCII comment); 2- no space for anything in blocks (end of segment with 20 bytes of SHA1, or nothing). As stated, I am thinking about "fake" (date==0==deleted) files to store information (7.15 ignores deleted ones). But it is not so easy, and even worse, not so fast. This is typically OK for magnetic disks, not so good for SSDs. But, at least for me, very slow file listing is currently the main defect.
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    Yesterday, 13:19
    SpyFX replied to a thread zpaq updates in Data Compression
    The zpaq format seems to me quite well thought out, and it is possible to squeeze additional improvements out of it without breaking backward compatibility. We could use any CDC, maybe even different ones for different data, but it seems to me that zpaq's CDC is not so bad at the moment (a generic CDC sketch follows below). I'm not satisfied with the processing of a large number of small files (hundreds of thousands or more), everything is very slow, as well as with processing large files from different physical HDDs; all of that comes from the fact that zpaq715 reads all files sequentially.
    2611 replies | 1118879 view(s)
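    For reference, a generic content-defined chunking sketch (this is not zpaq's exact fragmentation algorithm; the mask and size limits are illustrative): a chunk boundary is declared wherever the low bits of a running hash hit zero, so boundaries follow the content rather than absolute file offsets.
    #include <cstdint>
    #include <cstddef>
    #include <vector>

    // Returns the end offsets of content-defined chunks over data[0..n).
    std::vector<size_t> chunkBoundaries(const uint8_t* data, size_t n) {
        const uint32_t MASK = (1u << 16) - 1;           // ~64 KiB average chunk (illustrative)
        const size_t   MIN_CHUNK = 4096, MAX_CHUNK = 1u << 20;
        std::vector<size_t> cuts;
        uint32_t h = 0;
        size_t last = 0;
        for (size_t i = 0; i < n; ++i) {
            h = h * 314159265u + data[i] + 1;           // simple running hash over the current chunk
            size_t len = i - last + 1;
            if ((len >= MIN_CHUNK && (h & MASK) == 0) || len >= MAX_CHUNK) {
                cuts.push_back(i + 1);                  // chunk ends after byte i
                last = i + 1;
                h = 0;
            }
        }
        if (last < n) cuts.push_back(n);                // final partial chunk
        return cuts;
    }
    Identical runs of data then produce identical chunks no matter where they start in a file, which is what makes dedup across versions work.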
  • lz77's Avatar
    Yesterday, 13:13
    I once bought on eBay, from a seller with 100% positive feedback, pure Ceylon OPA high-quality tea, and got hay in a box named "Kenton". Then I found this site: theceylontea.com. On this site, a Sri Lankan woman artist pretends to be a tea company. Maybe you know good places that sell high-quality tea?
    0 replies | 36 view(s)
  • lz77's Avatar
    Yesterday, 11:30
    Just yesterday I fixed the last hard-to-find bug that occurred while adapting the dll to the conditions of the competition. My dlls are accepted for participation. The result is better than ULZ, but weak. I hope to take 5th place... What is the ratio without using ANS? What did you squeeze with ANS: literals and the higher 8 bits of offsets? How does zstd compress 1 GB to 280 MB in the rapid test? :confused: Does it use any historical buffer for this?
    128 replies | 13100 view(s)
  • brispuss's Avatar
    Yesterday, 11:19
    brispuss replied to a thread Fp8sk in Data Compression
    Thanks. I re-ran the compression tests and confirmed that paq8px193fix2 and paq8pxd90 do successfully compress the bfg.png file. But again the fp8* series compressors failed to compress the test file. Note that the tests were run under Windows 7 64-bit.
    77 replies | 6734 view(s)
  • a902cd23's Avatar
    Yesterday, 09:44
    a902cd23 replied to a thread Fp8sk in Data Compression
    Most likely because something has been corrected after version 98 on which fp8 is based. Even paq8px version 105 crashes on this png. I did not find v98 here or on github.
    77 replies | 6734 view(s)
  • brispuss's Avatar
    Yesterday, 04:19
    brispuss replied to a thread Fp8sk in Data Compression
    Running some compression tests with fp8sk32. But fp8sk32 crashes (the program stops/exits) when trying to compress a certain png file. Other png files seem to compress OK, so far. The file is enclosed below for reference (bfg.png). fp8 v6 also crashes when trying to compress this file. However, paq8px193fix2 and paq8pxd90 seem to compress this bfg.png file OK. So there must be something about this specific png file that the fp8* series compressors do not understand or interpret correctly. The compressors are all run from an elevated command prompt (run as administrator). Why do the fp8* compressors fail on this bfg.png file? What is the solution to enable the fp8* compressors to successfully compress this file?
    77 replies | 6734 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 03:04
    Let me check the hashxxk function on other jpeg files. Yes, you are right that I use hashxxk to get a better compression ratio on f.jpg. Which m1 context set? Let me fix them... thank you for your review, Gotty!!
    181 replies | 16150 view(s)
  • Gotty's Avatar
    Yesterday, 01:51
    Gotty replied to a thread Paq8sk in Data Compression
    @Surya,
    1) You may want to remove the 5 additional contexts (those that use the hashxxk function). They don't give any benefit, they just slow down compression. A little more detail: the hashxxk function is a 32-bit hash function and is absolutely useless in a hash table that selects buckets based on the top bits of a 64-bit key. So this hashxxk function is really bad for this hash table and causes nothing but collisions in the (bottom part of the) hash table. I have a theory why you are using it: you probably noticed that with the (normal) hash function compression gets worse with those 5 contexts, so you gave hashxxk a try, and maybe it helped to gain 1-2 bytes of compression with f.jpg? Well, the actual problem is that these 5 contexts are bad contexts (I didn't try them one by one though). But hashxxk ruins their effect (the "badness") so much that they actually become just noise and cause small random fluctuations in the compression ratio with different files. So these two bad guys just killed each other.
    2) The newly added m1 context sets (m1->set) are suspicious (I didn't test these, just looked at them briefly): you are overlapping the bits of the contexts, so those joint contexts are not properly set up and their ranges are also incorrect (a small bit-packing sketch follows below). Would you like to fix them? If you do, you'll need to learn binary arithmetic. It's a must. Really. Please.
    My suggestions: Don't use a 32-bit hash function where a 64-bit hash function is required. (In paq8px you need a 64-bit hash almost everywhere.) Test all your changes one by one on many files to be able to find out which contexts are useful at all. Learn binary arithmetic in order to set up joint contexts properly.
    A note: the name of your improvement is not "jpeg recompression" but "jpeg compression".
    Sorry that I always criticize your work. I see your potential, and that's what makes me come and try to help you by showing where you can do better. And you can do better. I hope so, at least.
    181 replies | 16150 view(s)
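    To illustrate the joint-context point above, a minimal sketch of the binary arithmetic (the field widths are hypothetical, not the actual paq8sk contexts): each sub-context must occupy its own bit field, and the declared range must be the product of the field sizes.
    #include <cassert>
    #include <cstdio>

    int main() {
        int coef = 37;   // assume this sub-context spans 0..63 -> 6 bits
        int zuzv = 5;    // assume this sub-context spans 0..7  -> 3 bits

        // Wrong: OR-ing the raw values lets the two fields overlap, so different
        // (coef, zuzv) pairs collapse into the same context value.
        int bad = coef | zuzv;

        // Right: shift one field clear of the other; the range is 64 * 8 = 512,
        // which is also the second argument a paq8-style m1->set(ctx, range) expects.
        int good = (coef << 3) | zuzv;
        assert(good < 64 * 8);

        std::printf("overlapping: %d  packed: %d\n", bad, good);
        return 0;
    }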
  • Gotty's Avatar
    26th November 2020, 21:49
    zstd.exe -19 0269c.bin -o 0269c.zstd: 2.40% (2142075000 => 51378475 bytes). Decompression speed is faster (3 sec vs 5 sec for LZMA), but the compression ratio is unfortunately not as good as LZMA's.
    7 replies | 457 view(s)
  • pklat's Avatar
    26th November 2020, 20:39
    have you tried ZSTD? btrfs supports ZLIB, LZO and ZSTD
    7 replies | 457 view(s)
  • suryakandau@yahoo.co.id's Avatar
    26th November 2020, 18:34
    Paq8sk37 - improved jpeg recompression - the speed is still fast. Using the -8 option on f.jpg (darek corpus): paq8px197 81362 bytes, paq8sk37 81192 bytes. Source code and binary file are inside the package.
    181 replies | 16150 view(s)
  • Gotty's Avatar
    26th November 2020, 17:47
    I performed some tests on the sample file. It looks like the "best" option to use (keeping in mind that decompression should not take more than a couple of seconds) is LZMA. I could not find a better algorithm. When I select the best LZMA options from 7zip, it crunches the file down to 43.3 MB. To go under that size you need a much slower algorithm (we're talking minutes) - I think that's not optimal for your case. xz has a similar but slightly worse ratio (44.2 MB), and slightly worse speed than 7zip's LZMA. I tried preprocessing the file (also the above-mentioned bitmap and RLE-style encoding), but it only helps the decompression speed slightly, not the ratio (7zip's LZMA: 44.1 MB).
    7 replies | 457 view(s)
  • fcorbelli's Avatar
    26th November 2020, 11:24
    fcorbelli replied to a thread zpaq updates in Data Compression
    I will try a more extreme approach, via an embedded precomputed txt listing. But extraction speed can be a problem, because it is appended after the first version. Putting it at the head could be better, but then I cannot handle errors.
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    26th November 2020, 10:12
    SpyFX replied to a thread zpaq updates in Data Compression
    add the fsize to 'FRANZOFFSET' and don't calculate fsize by reading h/block, I think it will speed up list command
    2611 replies | 1118879 view(s)
  • Jon Sneyers's Avatar
    26th November 2020, 00:45
    The benchmark_xl tool can visualize predictor choices, I don't know by heart what the parameter is to do it but it can do it. Modular mode has a very expressive bitstream and we're still actively trying to make a better and faster encoder for it. MA tree learning is quite complicated since we can vary the predictor per context, and the contexts are meta-adaptive. There will be a lot of room for encoder improvements, and even the current -s 9 -E 3 is still very far from exhaustive.
    35 replies | 2364 view(s)
  • Adreitz's Avatar
    26th November 2020, 00:34
    Thanks for the response, Jyrki. This is me being lazy, maybe, and not wanting to create yet another online account to make a single comment. Plus, I thought others would possibly see it as a personal preference rather than a bug or issue. I understand that the heuristics referred to as predictors 14 and 15 are able to select different predictors for different parts of each image, and it's good that P 15 now uses all of the available predictors rather than just two. However, I think it is surprising how often this heuristic is outperformed (for lossless) by simply forcing a single predictor for the entire image -- P 1, 2, and 5 seemed to come up often in my admittedly limited testing. (-s 9 is so slow that testing the various options exhaustively for more than a few images takes enough time that a new build of JXL is released and then I need to start over again!) Is there any tool available for visualizing the distribution of the predictors across a given JXL-encoded image? Or maybe something that visualizes the encoding expense in bits per pixel across the image, a la pngthermal? ​Aaron
    35 replies | 2364 view(s)
  • Jyrki Alakuijala's Avatar
    25th November 2020, 23:58
    Please, file an issue at the repository to wish for changes in documentation or code. Contexts can define the predictor to be used. Trying to define predictors at the image level is not going to be 'exhaustive' in that sense. --use_new_heuristics is something that is more at the level of an idea (or placeholder for coming code) than implemented yet. Ignore it for now. I think we will manage to get it working in about two months when it will likely replace the other basic heuristics for encoding. It will have a more holistic approach, reusing previous decisions instead of recomputing -- will be slightly better quality and faster encoding. Just doesn't work yet.
    35 replies | 2364 view(s)
  • fcorbelli's Avatar
    25th November 2020, 23:19
    fcorbelli replied to a thread zpaq updates in Data Compression
    Yes, with great results for years. I like it very much, with only a couple of defects. First is the lack of a fast check (almost addressed by my fork). Second is very slow file listing. I am thinking about fake deleted files (zpaq 7.15 does not extract a file if date=0). I will make my little mft-like file, ignored by zpaq 7.15 but used by zpaqfranz. I have to check by experiment whether listing gets faster. I tried borg years ago but I did not like it very much.
    2611 replies | 1118879 view(s)
  • Adreitz's Avatar
    25th November 2020, 23:08
    Thanks for the continued JXL development! A couple things related to the release candidate: I think it's a bit annoying that predictors 14 and 15 are discussed in the help but are not addressable by -P 14 or -P 15. I get that they are heuristics rather than predictors per se, but having them explicitly addressable would make brute-force testing a lot easier (since removing -P entirely requires making manual changes to the command line or adding batch file lines, and it might also be interesting to some people to test predictor 14 with speed 9 or predictor 15 with lower speeds). If you don't want to do that for some reason, I'd recommend not referring to the heuristics using these numbers in the help, as it's misleading. Also, does --use_new_heuristics have any effect on lossless encoding? I couldn't see any change from enabling it. Aaron
    35 replies | 2364 view(s)
  • algorithm's Avatar
    25th November 2020, 23:01
    algorithm replied to a thread zpaq updates in Data Compression
    @fcorbelli do you use zpaq for backups? A very good alternative is borg. Deduplication is similar, I think (content-based chunking). It uses zstd, lzma, lz4, has encryption, and you can even mount the archive and read it like a normal file system through FUSE.
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    25th November 2020, 22:49
    fcorbelli replied to a thread zpaq updates in Data Compression
    I can list A LOT of improvements to zpaq's archive format, making it much more like an RDBMS. But I do not like breaking compatibility at all. And I am not the author, so an incompatible fork would be fully "unofficial". I am thinking about storing info in a special file, like an mft. zpaq will just extract this file; zpaqfranz will use it like an embedded db. Maybe someone will propose better ideas.
    2611 replies | 1118879 view(s)
  • Shelwien's Avatar
    25th November 2020, 21:13
    Shelwien replied to a thread zpaq updates in Data Compression
    I think there's little point in preserving the zpaq format. Forward compatibility is a nice concept, but imho not worth adding redundancy to archives. Also, the zpaq format doesn't support many useful algorithms (eg. faster methods of entropy coding), so I think it's too early to freeze it in any case. Thus I'd vote for taking the useful parts from zpaq (CDC dedup, compression methods) and designing a better archive format around them. There are more CDC implementations than just zpaq's though, so maybe not even that.
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    25th November 2020, 20:30
    fcorbelli replied to a thread zpaq updates in Data Compression
    Last step: CRC32's of the blocks during compression.
    // Update HT and ptr list
    if (fi<vf.size()) {
      if (htptr==0) {
        htptr=ht.size();
        ht.push_back(HT(sha1result, sz));
        htinv.update();
        fsize+=sz;
      }
      vf->second.ptr.push_back(htptr);
    As mentioned, I'm trying to find a (simple) way to calculate the hashes of the blocks that make up a single file during the add() phase. In this way I would not have to re-read the file afterwards, calculating the CRC32 (an operation that takes time) for storing. However it is not easy, at least for me, to "intercept" the blocks. The "real" problem is not so much the new blocks, but the duplicate ones. In that case the duplicated block would theoretically have to be decompressed to calculate its CRC32 (so I would say no: it takes too long and is too complex). The alternative that comes to mind is something similar to FRANZOFFSET, that is, to store in blocks, in addition to the SHA1 code, also the CRC32. However, the ZPAQ "take the SHA1 data at the end of the block" mechanism seems to me rather rigid, with no concrete possibility of changing anything (compared to the 21 bytes of SHA1) without losing backwards compatibility (readSegmentEnd).
    // End segment, write sha1string if present
    void Compressor::endSegment(const char* sha1string) {
      if (state==SEG1)
        postProcess();
      assert(state==SEG2);
      enc.compress(-1);
      if (verify && pz.hend) {
        pz.run(-1);
        pz.flush();
      }
      enc.out->put(0);
      enc.out->put(0);
      enc.out->put(0);
      enc.out->put(0);
      if (sha1string) {
        enc.out->put(253);
        for (int i=0; i<20; ++i)
          enc.out->put(sha1string[i]);
      }
      else
        enc.out->put(254);
      state=BLOCK2;
    }
    void Decompresser::readSegmentEnd(char* sha1string) {
      assert(state==DATA || state==SEGEND);
      // Skip remaining data if any and get next byte
      int c=0;
      if (state==DATA) {
        c=dec.skip();
        decode_state=SKIP;
      }
      else if (state==SEGEND) c=dec.get();
      state=FILENAME;
      // Read checksum ///// this is the problem, only SHA1 or nothing
      if (c==254) {
        if (sha1string) sha1string[0]=0;  // no checksum
      }
      else if (c==253) {
        if (sha1string) sha1string[0]=1;
        for (int i=1; i<=20; ++i) {
          c=dec.get();
          if (sha1string) sha1string[i]=c;
        }
      }
      else error("missing end of segment marker");
    }
    Ideas? (a possible crc32_combine-based sketch follows below)
    2611 replies | 1118879 view(s)
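    One possible direction, sketched below with zlib (crc32() and crc32_combine() are real zlib calls; the per-block cache keyed by SHA-1 is an assumption, not existing zpaqfranz code): cache each unique block's CRC32 and length the first time the block is compressed, then assemble every file's CRC32 from its blocks in order, so duplicate blocks never have to be decompressed.
    #include <zlib.h>
    #include <vector>

    struct BlockCrc { uLong crc; uLong len; };          // cached once per unique block

    // CRC32 of a whole file, assembled from its blocks in file order.
    uLong fileCrc32(const std::vector<BlockCrc>& blocksInFileOrder) {
        uLong crc = crc32(0L, Z_NULL, 0);               // CRC of the empty stream
        for (const BlockCrc& b : blocksInFileOrder)
            crc = crc32_combine(crc, b.crc, (z_off_t)b.len);
        return crc;
    }

    // When a brand-new block is produced during add(), record its pair once:
    //   BlockCrc bc = { crc32(crc32(0L, Z_NULL, 0), blockData, blockSize), blockSize };
    //   cache[sha1OfBlock] = bc;   // hypothetical cache keyed by the block's SHA-1
    // A duplicate block then just reuses the cached pair: no decompression needed.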
  • fcorbelli's Avatar
    25th November 2020, 20:13
    fcorbelli replied to a thread zpaq updates in Data Compression
    I think this should be OK.
    void Jidac::write715attr(libzpaq::StringBuffer& i_sb, uint64_t i_data, unsigned int i_quanti)
    {
      assert(i_sb);
      assert(i_quanti<=8);
      puti(i_sb, i_quanti, 4);
      puti(i_sb, i_data, i_quanti);
    }
    void Jidac::writefranzattr(libzpaq::StringBuffer& i_sb, uint64_t i_data, unsigned int i_quanti, bool i_checksum, string i_filename, char *i_sha1)
    {
      /// experimental fix: pad to 8 bytes (with zeros) for 7.15 enhanced compatibility
      assert(i_sb);
      if (i_checksum)
      {
        assert(i_sha1);
        assert(i_filename.length()>0); // I do not like empty()
        assert(i_quanti<8); // just to be sure at least 1 zero pad, so < and not <=
        if (getchecksum(i_filename,i_sha1))
        {
          puti(i_sb, 8+FRANZOFFSET, 4); // 8+FRANZOFFSET block
          puti(i_sb, i_data, i_quanti);
          puti(i_sb, 0, (8 - i_quanti)); // pad with zeros (for 7.15 little bug)
          i_sb.write(i_sha1,FRANZOFFSET);
          if (!pakka)
            printf("SHA1 <<%s>> CRC32 <<%s>> %s\n",i_sha1,i_sha1+41,i_filename.c_str());
        }
        else
          write715attr(i_sb,i_data,i_quanti);
      }
      else
        write715attr(i_sb,i_data,i_quanti);
    }
    2611 replies | 1118879 view(s)
  • Jyrki Alakuijala's Avatar
    25th November 2020, 12:31
    Rare to have so much competence in a single event! Could be great to understand the similarities and differences between AVIF, JPEG XL, and WebP2, for example in a matrix form. Perhaps they are similar enough that a single encoder could generate content for each format. Perhaps they are similar enough that trans-coding could work without re-encoding (possibly only when a specific subset of features is used). From my viewpoint AVIF and WebP2 are close cousins and JPEG XL is the odd-one-out because of the following unique features:
    lossless JPEG transcoding
    an absolute colorspace is used allowing better preservation of dark areas
    simpler color handling as it has always the same HDR + high bit encoding:
      no encode-time decision for HDR or no HDR
      no encode-time decision 8/10/12 etc. bits (JPEG XL uses always float per channel internally)
      no encode-time decision about YUV420
      no encode-time decision about which colorspace to use, always XYB
    progressive coding
    hybrid delta/normal palettization
    focus on best quality with internet distribution bit rates
    180 replies | 58943 view(s)
  • a902cd23's Avatar
    24th November 2020, 22:16
    a902cd23 replied to a thread Fp8sk in Data Compression
    Compressed file is excel97 Notes.xls, 4266496 bytes
        fp8v6                  fp8sk32
    -1  Elapsed: 0:00:22,41    Elapsed: 0:00:26,25
    -2  Elapsed: 0:00:22,86    Elapsed: 0:00:26,77
    -3  Elapsed: 0:00:23,39    Elapsed: 0:00:27,35
    -4  Elapsed: 0:00:23,40    Elapsed: 0:00:27,64
    -5  Elapsed: 0:00:23,74    Elapsed: 0:00:28,29
    -6  Elapsed: 0:00:24,14    Elapsed: 0:00:29,03
    -7  Elapsed: 0:00:24,36    Elapsed: 0:00:29,78
    -8  Elapsed: 0:00:24,73    Elapsed: 0:00:29,18
    514 363  7.fp8sk32
    514 753  8.fp8sk32
    515 162  6.fp8sk32
    515 233  5.fp8sk32
    518 107  7.fp8
    518 211  4.fp8sk32
    518 541  8.fp8
    518 941  6.fp8
    518 951  5.fp8
    521 909  4.fp8
    527 435  3.fp8sk32
    531 346  3.fp8
    542 054  2.fp8sk32
    546 251  2.fp8
    565 925  1.fp8sk32
    570 949  1.fp8
    77 replies | 6734 view(s)
  • Scope's Avatar
    24th November 2020, 20:34
    ImageReady event
    Avif For Next Generation Image Coding by Aditya Mavlankar: https://youtu.be/5RX6IgIF8bw (Slides)
    The AVIF Image Format by Kornel Lesiński: https://youtu.be/VHm5Ql33JYw (Slides)
    Webp Rewind by Pascal Massimino: https://youtu.be/MBVBfLdh984 (Slides)
    JPEG XL: The Next Generation "Alien Technology From The Future" by Jon Sneyers: https://youtu.be/t63DBrQCUWc (Slides)
    Squoosh! App - A Client Side Image Optimization Tool by Surma: https://youtu.be/5s1UuppSzIU (Slides)
    180 replies | 58943 view(s)
  • SpyFX's Avatar
    24th November 2020, 19:01
    SpyFX replied to a thread zpaq updates in Data Compression
    if attr size == (FRANZOFFSET + 8) then checksum ok else if attr size == 8 checksum error ?
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    24th November 2020, 18:56
    fcorbelli replied to a thread zpaq updates in Data Compression
    Good, only max 8 bytes are needed (puti writes a uint64_t)... BUT... I am too lazy to iterate to write an empty FRANZOFFSET block if something goes wrong: i_sb.write(&pad,FRANZOFFSET); :D
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    24th November 2020, 18:44
    SpyFX replied to a thread zpaq updates in Data Compression
    OK, my small fix :)
    del line: const char pad = {0};
    change: i_sb.write(&pad,(8-i_quanti));
    to: puti(i_sb, 0, (8-i_quanti));
    full code, no zero buffer:
    if (getchecksum(i_filename, i_sha1))
    {
      puti(i_sb, 8 + FRANZOFFSET, 4); // 8+FRANZOFFSET block
      puti(i_sb, i_data, i_quanti);
      puti(i_sb, 0, (8 - i_quanti));
      i_sb.write(i_sha1, FRANZOFFSET);
      if (!pakka)
        printf("SHA1 <<%s>> CRC32 <<%s>> %s\n", i_sha1, i_sha1 + 41, i_filename.c_str());
    }
    else
    {
      puti(i_sb, 8, 4);
      puti(i_sb, i_data, i_quanti);
      puti(i_sb, 0, (8 - i_quanti));
    }
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    24th November 2020, 18:36
    fcorbelli replied to a thread zpaq updates in Data Compression
    ... no attributes, no checksum ... (I never use noattribute!) I'll fix it with a function (I edited the previous post). I attach the current source. EDIT: do you know how to "intercept" the blocks just before they are written to disk, in the add() function? I am trying to compute the CRC32 code of the file from the resulting compression blocks, sorted (as I do for verification). I would save re-reading the file from disk (by eliminating the SHA1 calculation altogether). In short: during add(), for each file and for each compressed block (even unordered) I want to save it in my own vector, and then process it. Can you help me?
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    24th November 2020, 17:52
    SpyFX replied to a thread zpaq updates in Data Compression
    Yes, right. For your extension, you should give 8 bytes for compatibility with zpaq 7.15, and then place the checksum. In your code, I don't see where the checksum is located if there are no attributes.
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    24th November 2020, 17:32
    SpyFX replied to a thread zpaq updates in Data Compression
    p/s sorry, i deleted my post
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    24th November 2020, 17:30
    fcorbelli replied to a thread zpaq updates in Data Compression
    Thank you, but... why? The attr does not have a fixed size; it can be 3 or 5 bytes, or 0, or... 55 (50+5) bytes long. At least that's what it looks like from the Mahoney source. But I could be wrong. Indeed, more precisely, I think the solution could be "padding" the "7.15" attr to 8 bytes (with zeros after the 3 or 5 bytes), then putting my new 50-byte attr block "in the queue". In this way the 7.15 source should always be able to take 8 bytes, of which 3 or 4 (the last ones) are zero, to put in dtr.attr (if (i<8)):
    -7.15tr- << 40 bytes of SHA1 >> zero <<CRC >> zero
    12345678 1234567890123456789012345678901234567890 0 12345678 0
    lin00000 THE-SHA1-CODE-IN-ASCII-HEX-FORMAT-40-BYT 0 ASCIICRC 0
    windo000 THE-SHA1-CODE-IN-ASCII-HEX-FORMAT-40-BYT 0 ASCIICRC 0
    Seems legit? Something like that (I know, I know... not very elegant...)
    void Jidac::writefranzoffset(libzpaq::StringBuffer& i_sb, uint64_t i_data, int i_quanti, bool i_checksum, string i_filename, char *i_sha1)
    {
      if (i_checksum) /// OK, we put a larger size
      {
        /// experimental fix: pad to 8 bytes (with zeros) for 7.15 enhanced compatibility
        /// in this case 3 attr, 5 pad, then 50
        const char pad = {0};
        puti(i_sb, 8+FRANZOFFSET, 4); // 8+FRANZOFFSET block
        puti(i_sb, i_data, i_quanti);
        i_sb.write(&pad,(8-i_quanti)); // pad with zeros (for 7.15 little bug)
        if (getchecksum(i_filename,i_sha1))
        {
          i_sb.write(i_sha1,FRANZOFFSET);
          if (!pakka)
            printf("SHA1 <<%s>> CRC32 <<%s>> %s\n",i_sha1,i_sha1+41,i_filename.c_str());
        }
        else // if something goes wrong, put zeros
          i_sb.write(&pad,FRANZOFFSET);
      }
      else
      { // default ZPAQ
        puti(i_sb, i_quanti, 4);
        puti(i_sb, i_data, i_quanti);
      }
    }
    ....
    if ((p->second.attr&255)=='u') // unix attributes
      writefranzoffset(is,p->second.attr,3,checksum,filename,p->second.sha1hex);
    else if ((p->second.attr&255)=='w') // windows attributes
      writefranzoffset(is,p->second.attr,5,checksum,filename,p->second.sha1hex);
    else
      puti(is, 0, 4); // no attributes
    With this observation I found a possible bug: what happens if the CRC calculation fails? Actually I should do it FIRST and, if OK, insert a FRANZOFFSET block. In other words: if a file cannot be opened, then I could save space by NOT storing SHA1 and CRC32. Next release...
    2611 replies | 1118879 view(s)
  • fcorbelli's Avatar
    24th November 2020, 17:27
    fcorbelli replied to a thread zpaq updates in Data Compression
    This is version 41 of zpaqfranz. It begins to look like something vaguely functioning. Using the -checksum switch (with the add command) stores BOTH the SHA1 and the CRC32 of each file inside the ZPAQ file. Those codes can be seen with the normal l (list) command in zpaqfranz. With zpaq 7.15 they should be ignored without errors. The t and p commands test the new format. The first uses the CRC32 codes (if present), and if desired, with -force, also does a comparison against the filesystem files for a double check. It is about as fast as the standard test of ZPAQ 7.15. The second, p (as in paranoid), does it on SHA1. In this case it's MUCH slower and uses MUCH more RAM, so it often crashes on 32-bit systems. -verbose gives a more extended result list.
    Examples:
    zpaqfranz a z:\1.zpaq c:\zpaqfranz\* -checksum -summary 1 -pakka
    zpaqfranz32 t z:\1.zpaq -force -verbose
    I emphasize that the source is a real mess, due to the "injection" of different programs into the original file, so as not to differentiate it too much from "normal" zpaq. It should be cleaned up and fixed, perhaps in the future. To summarize: zpaqfranz-41 can now check file integrity (CRC32) in a (hopefully) perfectly backward-compatible way with ZPAQ 7.15.
    EXE for Win32: http://www.francocorbelli.it/zpaqfranz32.exe
    EXE for Win64: http://www.francocorbelli.it/zpaqfranz.exe
    Any feedback is welcome.
    2611 replies | 1118879 view(s)
  • SpyFX's Avatar
    24th November 2020, 17:21
    SpyFX replied to a thread zpaq updates in Data Compression
    p/s sorry, I messed up; my post needs to be deleted (:
    2611 replies | 1118879 view(s)
  • Jon Sneyers's Avatar
    24th November 2020, 12:35
    It may be a nice lossy image codec, but lossless it is not:
    Lenna_(test_image).png PNG 512x512 512x512+0+0 8-bit sRGB 473831B 0.010u 0:00.009
    Lenna.bmp BMP3 512x512 512x512+0+0 8-bit sRGB 786486B 0.000u 0:00.000
    Image: Lenna_(test_image).png
    Channel distortion: PAE
      red: 1028 (0.0156863)
      green: 1028 (0.0156863)
      blue: 1028 (0.0156863)
      all: 1028 (0.0156863)
    Lenna_(test_image).png=> PNG 512x512 512x512+0+0 8-bit sRGB 473831B 0.040u 0:00.020
    1 replies | 235 view(s)
  • SEELE's Avatar
    24th November 2020, 11:44
    Thanks Jon, there's a bug! I'll fix it and then reupload (mods, feel free to delete this post for now).
    1 replies | 235 view(s)
  • Shelwien's Avatar
    24th November 2020, 02:28
    Done
    4 replies | 1030 view(s)
  • L0laapk3's Avatar
    24th November 2020, 01:11
    I will (hopefully) be compressing the data right away, before I ever write it to storage. Using a bitmap to only have to store 1 bit for every zero is a great idea; I'll also look into RLE. Maybe I can implement something myself, or maybe I'll rely on whatever compression library I end up using to do it for me.
    As for the use case: in Onitama, at the beginning of the game you draw 5 cards from 16 cards randomly. For the rest of the game, you use the same cards. I'm currently hoping to generate all 4368 end-game tablebases for every combination of cards once, storing all of this as compressed as I can. Once the game starts, I can unpack the tablebase for the right set of cards, and from there on it fully fits into RAM, so access speed is not a worry.
    I am not quite sure what the largest 8-bit value will be, as I have only generated the tablebase for a couple of card combinations so far. Generating it for all card combinations will be quite an undertaking. Currently this 8-bit value contains the distance to win or distance to loss (signed integer), divided by 8, so 8 optimal steps in the direction of the win correspond to a decrease in this value of 1. My implementation of the game's AI should be able to navigate its way through this. This means that with the 8-bit value I would be able to store distances to win of up to 1024 (any longer sequences I would just discard), which does seem a little on the high side; I might be able to get away with 6 or 7 bits, but I will only know for sure once I generate all the tablebases for every card combination.
    In the file above, I had not included all the losing values. Since for every winning board there is exactly one opposite board with the same losing value, I can compute these when unpacking. So I can throw away the sign bit of the value, leaving only 7 bits. Further, I have the exact count of each value before I even start compressing: ~90% of non-zero values will be 1, ~8% will be 2, ... For the higher values this depends quite a bit on the card set; some barely have any high values at all and some have a lot. I can try creating some sort of Huffman encoding for this myself; I'm not sure if this will hinder the actual compression later if I give it worse Huffman-encoded stuff than it could do itself with raw data (a rough entropy estimate for this is sketched below).
    7 replies | 457 view(s)
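    On the Huffman question at the end, a rough order-zero estimate can be computed directly from the value counts; the tail probabilities below are made up for illustration, and the real counts go in once all tablebases exist. If the estimate is close to what zstd/7z already achieve on the raw bytes, a hand-rolled Huffman pass is unlikely to help and may hide structure from the later compressor.
    #include <cmath>
    #include <cstdio>

    int main() {
        // Hypothetical distribution of the non-zero values: P(1)=0.90, P(2)=0.08, small tail.
        const double p[] = {0.90, 0.08, 0.015, 0.004, 0.001};
        double bitsPerValue = 0.0;
        for (double q : p)
            if (q > 0.0) bitsPerValue -= q * std::log2(q);   // Shannon order-0 entropy
        std::printf("order-0 lower bound: ~%.2f bits per non-zero value\n", bitsPerValue);
        return 0;
    }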
  • Gotty's Avatar
    23rd November 2020, 23:30
    I'd like to hear more about your use case. I suppose you'd need to acquire the 8-bit information fast, given the 31-bit key, right? Would such a lookup be performed multiple times during a single board-position evaluation, or just once? Or just occasionally during the endgame, when the number of pieces is below n, for example? So: how often do you need to fetch data from these maps? How much time is OK to fetch an 8-bit value? Nanoseconds, milliseconds, one sec? What is the largest 8-bit value you may have in all your maps? 90 in the sample you provided. Any larger? It looks like there are only even numbers in the sample. Is that the case only for this file, or is it general? Or do some files have odd and some even numbers (consistently)?
    7 replies | 457 view(s)
  • CompressMaster's Avatar
    23rd November 2020, 22:45
    @shelwien, could you delete the member BestComp? That account belongs to me (I created it when we discussed the one email address, two accounts issue), but I don't remember the registration email and I don't want to have this account at all. Thanks.
    4 replies | 1030 view(s)
  • Shelwien's Avatar
    23rd November 2020, 20:14
    1. One common method for space reduction is bitmaps: add a separate bit array of (value>0) flags; this would result in 8x compression of the zero bytes. Of course, you can also assign flags to pages of convenient size, or even use a hierarchy of multiple bitmaps (a minimal sketch of this layout follows below). 2. There are visible patterns in the non-zero values, eg. ("04" 29 x "00") x 16. Maybe use RLE (or any LZ, like zlib, zstd) for the non-zero pages.
    7 replies | 457 view(s)
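    A minimal sketch of the bitmap layout from point 1 (slot count and value width taken from the post above; nothing here is tuned): one presence bit per slot, plus a dense array holding only the non-zero values in slot order, so the two streams can be compressed separately by zstd/7z.
    #include <cstdint>
    #include <vector>

    struct SparseTable {
        std::vector<uint8_t> bitmap;   // 1 bit per slot: is a value present?
        std::vector<uint8_t> values;   // the non-zero values only, in slot order

        explicit SparseTable(uint64_t slots) : bitmap((slots + 7) / 8, 0) {}

        void add(uint64_t key, uint8_t v) {            // keys added in increasing order, v != 0
            bitmap[key >> 3] |= uint8_t(1u << (key & 7));
            values.push_back(v);
        }
    };

    // Building it from the flat 2^31-byte array:
    //   SparseTable t(1ull << 31);
    //   for (uint64_t k = 0; k < (1ull << 31); ++k)
    //       if (flat[k] != 0) t.add(k, flat[k]);
    // The bitmap is 2^31/8 = 256 MiB raw; the value stream is one byte per existing
    // key (~2E8 bytes here). Lookup = test the bit, then rank (popcount) into values.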
  • pklat's Avatar
    23rd November 2020, 17:28
    I'd just use a 'proper' filesystem that supports sparse files on a 'proper' OS :) On Windows, perhaps you could use on-the-fly compression; I think it doesn't support sparse files. tar supports them, iirc.
    7 replies | 457 view(s)
  • lz77's Avatar
    23rd November 2020, 15:37
    Hm, when I run TestGLZAdll.exe in test.bat with my compression dll renamed to GLZA.dll, I get the error "TestGLZAdll.exe - application error 0xc000007b"... The test works correctly with the original GLZA.dll. Oh, the original GLZA.dll also contains both decodeInit & decodeRun, but I have separate dlls for encoding and decoding. Perhaps this is the reason for the error...
    128 replies | 13100 view(s)
  • L0laapk3's Avatar
    23rd November 2020, 14:14
    Hi all, let me start off by mentioning that I am a complete noob at compression. I understand basic concepts about information density but that's about it. I'm currently building tablebases for a board game (Onitama) and the files I end up with are quite large. The table is an unordered map structure with a unique 31-bit board key and an 8-bit distance-to-win value. There are around 2E8 entries in the map.
    The best result I've managed is by removing the key altogether and storing a large 2^31-address byte array. Since most of the keys don't exist, the array consists mostly of zeroes. This results in a much larger uncompressed file of 2GB, however 7z manages to compress it to just 47MB, 2% of its original volume.
    My question is the following: are there more efficient ways to store such sparse data? My goal is to store around 4400 of these, which currently puts the total size in excess of 200GB. I have included one such file for testing: https://we.tl/t-cYtvo5Fw90 (47MB)
    Many thanks, L0laapk3
    7 replies | 457 view(s)
  • Dresdenboy's Avatar
    23rd November 2020, 11:22
    :_good2:
    128 replies | 13100 view(s)
  • lz77's Avatar
    23rd November 2020, 11:04
    I just saw where the error is: It happened while adapting my subroutine to the conditions of the competition. I confused the name of one variable.
    128 replies | 13100 view(s)
  • suryakandau@yahoo.co.id's Avatar
    23rd November 2020, 02:54
    Using the -8 option on enwik6, the result is: paq8sk35 196112 bytes, paq8sk36 196112 bytes. Here are paq8sk35 and paq8sk36 rebuilt using the -DNDEBUG flag.
    181 replies | 16150 view(s)
  • suryakandau@yahoo.co.id's Avatar
    23rd November 2020, 02:26
    @darek yes, Gotty is right about paq8sk34; paq8sk34 had a serious flaw. @gotty I will put back the -DNDEBUG flag and fix CHANGELOG and README. Thank you Darek!! Thank you Gotty!!
    181 replies | 16150 view(s)
  • Cyan's Avatar
    23rd November 2020, 00:46
    Cyan replied to a thread ARM vs x64 in The Off-Topic Lounge
    Phoronix made an early comparison of M1 cpu performance on Mac mini: https://www.phoronix.com/scan.php?page=article&item=apple-mac-m1&num=1 Results are impressive, especially for the kind of power class M1 works at. Even when running `x64` code with an emulation layer, M1 is still faster than its ~2 years old Intel competitor (i7-8700B). The downside of this study is that, since it only uses Mac Mini platforms, it isn't comparing vs Intel's newer Tiger Lake, which would likely show a more nuanced picture. Still, it's an impressive first foray into PC territory.
    17 replies | 2062 view(s)
  • Gotty's Avatar
    23rd November 2020, 00:15
    Gotty replied to a thread Paq8sk in Data Compression
    @Darek, I think, sk34 is not worth testing - it had a serious flaw. It is fixed in sk35.
    181 replies | 16150 view(s)
  • Gotty's Avatar
    23rd November 2020, 00:01
    Gotty replied to a thread Paq8sk in Data Compression
    I'm glad you found the problem with the assert. I didn't look too deeply (into v35) and found mostly cosmetic issues so far. Also the compression looks good (for text files); for binary files there are fluctuations and some files are significantly worse. You should REALLY test with many file types - as I mentioned a couple of times and as Darek also just wrote in his latest post. Did you put back the -DNDEBUG flag when publishing the release build? (With the asserts in effect, speed is worse.) You'll need to fix CHANGELOG, as it is missing the log entry for paq8px_v197. README: as I mentioned, you cannot just replace px with sk. From the README one would get the impression that paq8sk started in 2009. The link to the thread is also incorrect. Please read the README carefully (and fix it). There are also many references to paq8px in many other files. The most important ones are probably the Visual Studio solution file and CMakeLists.txt - they are still looking for paq8px.cpp. Do you plan to fix them?
    181 replies | 16150 view(s)
  • suryakandau@yahoo.co.id's Avatar
    22nd November 2020, 22:52
    Yes sir I can provide it
    181 replies | 16150 view(s)
  • Darek's Avatar
    22nd November 2020, 22:03
    Darek replied to a thread Paq8sk in Data Compression
    @suryakandau -> first of all: paq8sk34 got quite interesting results for textual files on my testbed -> not big differences, but the best scores for all textual files overall!!!! Other file scores are worse than paq8px_v195 or paq8px_v196, which are (I understand) the base for paq8sk34. However, I'll test paq8sk35 and paq8sk36 next. You should pay more attention to the overall compression ratio. It's not the best practice when the ratios of one or two kinds of files are improved at the cost of other files. My request: could you provide a comparison of the enwik6 results for paq8sk34, paq8sk35 and paq8sk36? I'll prepare to test one of them on enwik8 (at the moment that is the only file length I can test) and I need to choose the best :)
    181 replies | 16150 view(s)
  • Shelwien's Avatar
    22nd November 2020, 17:17
    Kennon Conrad just posted his framework in other thread - https://encode.su/threads/1909-Tree-alpha-v0-1-download?p=67549&viewfull=1#post67549 You can rename your dll to GLZA.dll and see what happens. Reposting the binaries here, because he missed a mingw dll.
    128 replies | 13100 view(s)
  • lz77's Avatar
    22nd November 2020, 16:47
    Thanks. If I had known in June which category was right for my participation, I would have made a preprocessor and written these dlls in FASM for Win64... Unfortunately I haven't practiced C yet. Maybe someone will send me this C test program for CodeBlocks... Here is my Delphi 7 example. It gives the dll the contents of a file named "bl00.dat" and writes the compressed data to a file named "packed". P.S. packedsize is a var parameter and is passed to the dll by pointer. Here's the declaration from the dll:
    type pb = ^byte;
    ...
    function encodeRun(inSize: dword; inPtr: pb; var outSize: dword; outPtr: pb; cmprContext: pointer): longint; cdecl;
    128 replies | 13100 view(s)