# Activity Stream

• Today, 05:58
I have read the counting arguments and everything about why compressing random data is impossible. But I'm curious why humans can't solve it - maybe it's not impossible, maybe we just haven't found the correct way yet. I'm currently trying to find my mistake, in case the method isn't going to work. So far, this is the result.

I'm talking about the idea of the output table size, i.e. the compressed file size. If the ID is only 2 digits long, the compressed file size is: N-digit ID x 16 x 256 = 2 x 16 x 256 = 8,192 bytes, or 8 KB. The input file to compress must not be smaller than 8 KB.

Let's take an example: I have a file of 10,000 bytes. In a hex editor with 16 columns it has 10,000/16 = 625 rows. To cover all the patterns on the rows I need (2^625)-1 = 139.234.637.988.958.594.318.883.410.818.490.335.842.688.858.253.435.056.475.195.084.164.406.590.796.163.250.320.615.014.993.816.265.862.385.324.388.842.602.762.167.013.693.889.631.286.567.769.205.313.788.274.787.963.704.661.873.320.009.853.338.386.432 patterns (result from a big-integer calculator: https://defuse.ca/big-number-calculator.htm).

Now I will check whether a 2-digit ID can store that many patterns, using every possible character to generate the ID. By inspecting the available characters manually, I count about 216 characters to build the ID from (each character is 1 byte). For a 2-digit ID built from combinations of 216 characters, the formula n^r gives 216^2 = 46,656 IDs.

So I run out of pattern IDs. Yes, it fails, and that is the mistake. Now I'm still thinking about other possibilities. I have several alternative formulas to reduce the number of patterns and expand the stock of IDs; I need more time to set up a small experiment. Any kind of help will be appreciated. Thank you.
170 replies | 73456 view(s)
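The arithmetic in the post above can be checked directly. A minimal Python sketch, using the row, pattern and ID counts exactly as defined in the post (625 rows, 2^625-1 patterns, a 216-character alphabet):

```python
# Counting check for the scheme described in the post:
# a 10,000-byte file viewed as 625 rows of 16 bytes,
# and a 2-character ID drawn from a 216-symbol alphabet.
rows = 10_000 // 16            # 625 rows in the hex-editor view
patterns = 2 ** rows - 1       # possible row patterns to label
ids = 216 ** 2                 # available 2-character IDs (n^r)

print(ids)                     # 46656
print(patterns > ids)          # True: far more patterns than IDs
print(len(str(patterns)))      # 189: the pattern count has 189 digits
```

Since 216^2 = 46,656 while the pattern count has 189 digits, the 2-character ID space falls short by an astronomical factor, which is the counting argument in miniature.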
• Yesterday, 18:32
Darek replied to a thread Paq8pxd dict in Data Compression
Scores of enwik8/9 compressed by paq8pxd v68 - a nice gain (0.08-0.11%), however still about 0.3% worse than the v63 scores.
16'309'012 - enwik8 -s8 by Paq8pxd_v63, time 9'240s
15'967'201 - enwik8 -s15 by Paq8pxd_v63, time 8'880s
16'637'302 - enwik8.drt -s15 by Paq8pxd_v63, time 11'837s
126'597'584 - enwik9_1423 -s15 by Paq8pxd_v63, time 98'387s
16'374'223 - enwik8 -s8 by Paq8pxd_v67_AVX2, +0.40% to v63 score, time 6'431s
16'048'070 - enwik8 -s15 by Paq8pxd_v67_AVX2, +0.51% to v63 score, time 6'643s
16'774'998 - enwik8.drt -s15 by Paq8pxd_v67_AVX2, +0.83% to v63 score, time 8'413s
127'063'602 - enwik9_1423 -s15 by Paq8pxd_v67_AVX2, +0.37% to v63 score, time 66'041s
16'364'165 - enwik8 -s8 by Paq8pxd_v68_AVX2
16'033'591 - enwik8 -s15 by Paq8pxd_v68_AVX2
16'755'942 - enwik8.drt -s15 by Paq8pxd_v68_AVX2
126'958'003 - enwik9_1423 -s15 by Paq8pxd_v68_AVX2
641 replies | 253885 view(s)
• Yesterday, 18:28
Ok, then I have some parameters to test.
83 replies | 7925 view(s)
• Yesterday, 15:23
No, they exist since the first version of NNCP.
83 replies | 7925 view(s)
• Yesterday, 14:40
@Mauro - looks promising :). Did you implement it in the new NNCP version? Updated the scores for K.WAD and L.PAK - I've tested these files with the hidden size 1024 setting and got another 45KB of gain. In total my optimization gives 7.2% compared to the default options (almost 1.1MB of gain). This puts NNCP in 13th place among all compressors - without any specialized models, that's impressive!
83 replies | 7925 view(s)
• Yesterday, 11:18
Whether or not you have a method that beats TurboRLE, it's clear you've prodded dnd into improving his own code so it's still a win for data compression. Thanks. :-)
8 replies | 571 view(s)
• Yesterday, 11:15
"hard to compress it without occupying more information than original". You still don't understand unfortunately. Not hard, but impossible. (Possible on some files, but not on all and over time you'll never win). Please read and understand the link I listed before: http://www.faqs.org/faqs/compression-faq/part1/section-8.html I'm done with this thread. I've tried to help you, honestly. I'm not here to poke fun, but to save you from yourself. You will never succeed at random data compression and the evidence for why not is right there in that link. Non-random data... now that's an entirely different and more fruitful story. JPEG (even uncompressed JPEG) fits into that category.
170 replies | 73456 view(s)
• Yesterday, 08:21
TurboRLE: Run Length Encoding update:
- Better compression
- Skylake benchmark updated
- ARM NEON support
- New benchmark on ARM Cortex A73 ODROID-N2
25 replies | 9525 view(s)
• Yesterday, 05:30
Hello guys. My new algorithm compresses a 5 MB file to 13 bits.
0 replies | 84 view(s)
• Yesterday, 00:35
Here's what I learned about ZIP compression with ECT:

1. On PNG files, advpng -i 10000 almost never beats ECT -60500, and if so, only by one or two bytes. That's expected, because both are built on Zopfli, with ECT being much improved.

2. I assumed the same was true for ZIP files, but to my surprise:
Some LibreOffice document: 38,922 B
7-Zip Ultra ZIP: 12,926 B
This ZIP file is our start.
ECT_x64.exe (the Windows version from 2017-08-15) -9 -zip --disable-png --disable-jpg: 12,356 B
advzip.exe --recompress --shrink-insane: 12,341 B
Leanify.exe -d 1: 12,337 B
Something is amiss: ECT directly calls Leanify's code and should at least reach the same compression! Its Zopfli is also more optimized than advzip's, so I didn't expect it to perform worse. The checksum on the archive content is identical, so no cheating on the original data here. This is a very small difference, but it gets larger as iteration counts increase. I saw a 1% improvement on another file last weekend, and that's what got me started in the first place. And it brings us directly to the situation SnakeSnoke described: you can repeat the Leanify-ECT-advzip loop almost endlessly to gain more compression.

3. I downloaded the current ECT source and compiled it (a rough ride because I'm on Windows with Visual Studio). Just like Felix said, ECT's ZIP optimization is based on Leanify, which in turn calls Zopfli. It'll take some time for me to figure out where it goes wrong, but I just wanted to say: be cautious; ECT is not as good on ZIP as I expected it to be! But it's probably just a small bug or oversight :)
399 replies | 103302 view(s)
• 19th August 2019, 23:00
Do you know that if you don't preprocess the 200K JPEG file with your custom preprocessor, it compresses much better? I tried it with paq8px (without the JPEG model): 185K after your preprocessing, 176K in the original form. (Unfortunately I don't have the resources to run cmix.) This means your custom preprocessing hurts compression.
170 replies | 73456 view(s)
• 19th August 2019, 22:51
It is still not random. I can just repeat what JamesB told you: JPEG files are not random. What do you mean here? (Preprocessing is not compression.) I'm confused here. Faster than...? Thanks for asking. You posted an old version in exe format; you should have posted a link to the source. But don't worry, I have it. As I see the source code (which is more than 1 MB), it would take too much time to study the algorithm and engineer a proper anti-file. For comparison, the source code for paq8px is around half a megabyte; I've been studying the latter for some time, and I'm still not at 100% :rolleyes:. But my understanding was enough to create an anti-file (and it was straightforward in that case). For BSC it's not as simple as with paq8px, because there is a transformation step. Anyway, I verified: it tries to compress the data using p<>0.5 probabilities. Unfortunately, I don't really have months to spend analyzing and reverse engineering the data flow. I'll need to give up on this challenge, sorry. Also, you would need to verify that the anti-files are really anti-files. Would you be able to do that?
170 replies | 73456 view(s)
• 19th August 2019, 22:37
TurboBench will store the exact peak memory usage for compression and decompression in the result file ".tbb". This works only when compiled on Linux and without static linking (the default TurboBench mode). TurboBench can only track memory allocated dynamically. Memory on the stack or allocated with "mmap" (memory mapped) cannot be tracked. No changes in the source code of the codecs are necessary. Here is an example of memory usage: Static/Dynamic web content compression benchmark
154 replies | 37561 view(s)
• 19th August 2019, 21:29
1. I hacked the server to find the info, then @encode paid for it. 2. Dunno. I had to install new version of forum engine, since old one didn't work on new server. 3. Nope, we tried everything I think. In any case, at this point we'd not move it back to his server even if he reappears.
40 replies | 1525 view(s)
• 19th August 2019, 20:32
Darek replied to a thread Paq8pxd dict in Data Compression
Scores of 4 corpuses for paq8pxd v68.
- Canterbury Corpus got its best score of any paq8pxd version, but the other corpuses did not.
- Calgary loses 700 bytes to the best v62 score.
- MaximumCompression loses 12KB to the v62 version.
- The biggest loss to the v62 version comes from the Silesia corpus = 83KB (0.28%) - mostly on the Mozilla (27KB), ooffice (15KB) and Webster (31KB) files.
Similarly to v67, this version has the same issue - for option -s15 and for some files (G.EXE, H.EXE from my filetest and AcroRd32.exe, FlashMX.pdf, MSO97.DLL from Maximum Compression) the program didn't finish the compression process and exited w/o crash... so for Maximum Compression these three files (AcroRd32.exe, FlashMX.pdf, MSO97.DLL) are compressed with the -s14 option.
641 replies | 253885 view(s)
• 19th August 2019, 18:51
@Shelwien,
1. As for the old domain encode.ru - who extended that domain without access to it? It's paid till 2020.
2. Icons on attached files are different.
3. Any chance to contact the webmaster using his registration email address? Or, if he has a website, via the registrant email or the registrar company?
40 replies | 1525 view(s)
• 19th August 2019, 18:29
We have a limited byte count - 256 possible values. That's a big advantage (even for text files with 0% repeated symbols - i.e. every character is unique). That means at every 256th position there's a repeated sequence, but it's hard to compress it without occupying more information than the original. So random isn't random; the problem is how to compress it without occupying more info than the original... See my second thread, test it on your JPEG files, post some results and then let's discuss it deeper.
170 replies | 73456 view(s)
• 19th August 2019, 18:13
As I said before: A) JPEG isn't random. It's compressible because JPEG hasn't managed to entropy-code everything perfectly (for starters, most JPEGs are Huffman-coded instead of arithmetic-coded). B) Reliable and consistent compression of random data is impossible. I don't mean hard, I mean mathematically, provably impossible. You may have a valid compression method for non-random data, but please think about how your assertions look. While you're claiming to be able to compress random data, everyone with any knowledge of data compression will be thinking you're a crackpot. That means even if you do have a good algorithm, it'll be completely ignored, and rightly so.
170 replies | 73456 view(s)
• 19th August 2019, 18:06
Well, I'm able to compress already-compressed JPEGs (i.e. random data) with my custom preprocessing method (it's not an algorithm) without any decompression at all - see my second thread. A 200 KB JPEG can be losslessly shrunk down to 174 KB by CMIX without knowing that it's a JPEG image. You're right - patterns in non-random data. I've used my custom data preprocessing method to minimize the randomness in the original file. As for the incompressibility of files, let's wait for my custom data compression algorithm - BestComp. Maybe I have overestimated expectations, but never say never...
170 replies | 73456 view(s)
• 19th August 2019, 16:38
Do you have any benchmarks that track CPU and RAM usage during compression and decompression?
154 replies | 37561 view(s)
• 19th August 2019, 13:59
Thank you. I downloaded the random data from the source https://archive.random.org/download?file=2019-08-19.txt. For the non-random data, the file size is too big and it takes a long time to download. Once the random data has been downloaded, it becomes an offline file and the bits don't change (unless the file can modify itself). So if the bits in the file aren't changing, is it still random? Because when the bits just stay like that, I'm still able to create the patterns on them.
170 replies | 73456 view(s)
• 19th August 2019, 13:56
TurboBench - Compression Benchmark updated.
- All compressors updated to the latest version
- rle8 added
- TurboRLE: Run Length Encoding improved + new benchmarks

Some compressors, like rle8 for example, must be downloaded manually. Example:

git clone --recursive git://github.com/powturbo/TurboBench.git
cd TurboBench
git clone git://github.com/rainerzufalldererste/rle8.git
make RLE8=1

To make a formatted table in encode.su like below or like in this post:

H1|H2|H3
A|B|C
D|E|F

use: ./turbobench -p5 -o data.tbb
"data.tbb" is the turbobench output file after benchmarking the file "data"
154 replies | 37561 view(s)
• 19th August 2019, 12:54
Random data: https://archive.random.org/binary Not random: https://dumps.wikimedia.org/other/static_html_dumps/current/en/html.lst
170 replies | 73456 view(s)
• 19th August 2019, 12:28
My apologies - can anyone give me an example of a random data file and a non-random data file as an attachment? I will try to understand the difference. Thank you :_good2:
170 replies | 73456 view(s)
• 19th August 2019, 11:52
No it doesn't. That is provably impossible. Please read http://www.faqs.org/faqs/compression-faq/part1/section-8.html Don't bother trying to explain why your algorithm is different - it won't be. The only way any tool works, even things like CMIX, is by spotting patterns in non-random data and exploiting them. That means no tool can compress every file, but that's fine as we don't generally want to compress random data. You may well have a useful tool, but if so focus on where it is useful (people's actual data) and not where it is not (random data).
170 replies | 73456 view(s)
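The counting argument behind "no tool can compress every file" can be made concrete in a few lines. A sketch (n = 8 is an arbitrary illustrative choice, not anything from the FAQ itself):

```python
# A lossless compressor must be injective: two different inputs
# can never produce the same output. There are 2**n inputs of
# exactly n bits, but only 2**n - 1 bit strings that are shorter.
n = 8
inputs = 2 ** n                                  # 256 distinct n-bit files
shorter_outputs = sum(2 ** k for k in range(n))  # 1 + 2 + ... + 128 = 255

print(inputs, shorter_outputs)   # 256 255
# By the pigeonhole principle, at least one n-bit input must map to
# an output of n bits or more - for every n, no matter the algorithm.
print(inputs > shorter_outputs)  # True
```

The same inequality holds for every n, which is why spotting patterns in non-random data is the only thing any compressor can exploit.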
• 19th August 2019, 06:46
I don't think anybody said it was from Google, but ChaCha20 is getting a lot of support from Google: https://security.googleblog.com/2014/04/speeding-up-and-strengthening-https.html https://vikingvpn.com/blogs/security/google-is-pushing-new-cipher-suites-all-about-chacha20-and-poly1305 I just find it funny that in the AES competition both runners-up (Serpent and Twofish) get a lot of attention, but even the finalists from eSTREAM, besides Salsa20 (ChaCha20), are like forgotten myths.
4 replies | 136 view(s)
• 19th August 2019, 06:12
FYI, neither Salsa20 nor ChaCha is from Google. They're both from Daniel Bernstein.
4 replies | 136 view(s)
• 18th August 2019, 23:59
Sebastian replied to a thread paq8px in Data Compression
I have tested the new audio model from Paq8px_v181 (-6) for 16-bit stereo files against OptimFrog v5.1 (--preset max) and Sac v0.5.0 on some widely used test samples. Performance is impressive but overall still behind OFR and Sac. Numbers are "Bits per Sample", where smaller is better.

Name|SAC|OFR|PAQ
ATrain|7.092|7.156|7.376
BeautySlept|7.596|7.790|7.846
chanchan|9.704|9.778|9.740
death2|5.175|5.465|5.224
experiencia|10.885|10.915|10.985
female_speech|4.417|4.498|4.691
FloorEssence|9.188|9.409|9.506
ItCouldBeSweet|8.226|8.310|8.362
Layla|9.582|9.571|9.742
LifeShatters|10.778|10.808|10.890
macabre|9.018|9.026|9.266
male_speech|4.323|4.256|4.532
SinceAlways|10.355|10.409|10.479
thear1|11.387|11.400|11.496
TomsDiner|7.034|7.108|7.087
velvet|9.875|9.990|10.059
Mean|8.415|8.493|8.580
1664 replies | 475486 view(s)
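For readers unfamiliar with the metric in the table above: "Bits per Sample" is just the compressed size in bits divided by the total number of audio samples. A minimal sketch (the file size and sample count below are made-up illustrative values, not from the test set above):

```python
def bits_per_sample(compressed_bytes: int, num_samples: int) -> float:
    """Compressed size in bits divided by the total sample count
    (for stereo, both channels count as samples)."""
    return compressed_bytes * 8 / num_samples

# Hypothetical example: 1,000,000 stereo frames = 2,000,000 samples,
# compressed to 1,750,000 bytes -> 7.0 bits per sample.
print(bits_per_sample(1_750_000, 2_000_000))  # 7.0
```

At 16 bits per sample uncompressed, a score of 7.0 would correspond to a compression ratio of 7/16, roughly 44%.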
• 18th August 2019, 19:40
Compiling under TurboBench: Compression Benchmark
To enable compilation under gcc, I've modified "rle8.h" by surrounding it with 'extern "C"' and renamed "_xgetbv" (already defined in gcc) in "rle8_cpu.c". To build turbobench, clone rle8 under the turbobench directory, then use "make RLE8=1".

git clone --recursive git://github.com/powturbo/TurboBench.git
cd TurboBench
git clone git://github.com/rainerzufalldererste/rle8.git
(modify rle8.h and rle8_cpu.c as described)
make RLE8=1
8 replies | 571 view(s)
• 18th August 2019, 19:26
Thanks for integrating rle8 into TurboBench! :) Seems like I need to do some more optimizations.
8 replies | 571 view(s)
• 18th August 2019, 14:53
It is relatively easy to remove all artefacts, but very difficult to do it in a way that image features, particularly subtle textures like stone, skin, tapestry, low-intensity scratches and cracks, don't suffer. In PIK and JPEG XL we have paid a lot of attention to this since, unlike in video, there are no future frames that would eventually improve on those missing textures.
41 replies | 4741 view(s)
• 18th August 2019, 13:29
Only PIK preserved skin texture:
41 replies | 4741 view(s)
• 18th August 2019, 13:03
TurboRLE: Run Length Encoding vs RLE8: a Fast 8 bit RLE
I've integrated rle8 into TurboBench: Compression Benchmark and tested this with different distributions. TurboRLE is pareto for all benchmarked files. See the full benchmark.
8 replies | 571 view(s)
• 18th August 2019, 13:01
TurboRLE: Run Length Encoding vs RLE8: a Fast 8 bit RLE
I've integrated rle8 into TurboBench: Compression Benchmark and tested this with different distributions. The results are sorted by compression ratio. TurboRLE is pareto for all benchmarked files. TurboRLE compresses up to 200% better than the next rle8 and compresses several times faster. TurboRLE decompresses enwik9bwt more than 10 times faster than the next best rle8.

TurboBench: - Aug 19 2019 - CPU Skylake i7-6700 3.4GHz.

C Size|ratio%|C MB/s|D MB/s|Name|File (MB=1.000.000)
2623680| 0.6| 2074|11113|trle|girl.bmp
4148455| 1.0| 2063|12126|srle 0|girl.bmp
4744806| 1.2|10766|12343|srle 8|girl.bmp
5901239| 1.5| 861|12083|rle8 1|girl.bmp
8431844| 2.1| 7368|12693|srle 16|girl.bmp
13722311| 3.4|11090|13188|srle 32|girl.bmp
16658439| 4.1| 855|12335|rle8 2 single|girl.bmp
19839711| 4.9|16269|13733|srle 64|girl.bmp
27273225| 6.8| 724| 7320|rle8 3 ultra|girl.bmp
37731711| 9.3| 725| 5346|rle8 4 ultra single|girl.bmp
403920054|100.0|13978|14001|memcpy|girl.bmp

C Size|ratio%|C MB/s|D MB/s|Name|File (MB=1.000.000)
73108990| 17.4| 754| 2983|trle|1034.db
84671759| 20.2| 741| 5065|srle 0|1034.db
88666372| 21.2| 437| 2668|rle8 2 single|1034.db
88666376| 21.2| 456| 2668|rle8 1|1034.db
92369164| 22.0| 1019| 5860|srle 8|1034.db
104249934| 24.9| 412| 2919|rle8 4 ultra single|1034.db
104249934| 24.9| 428| 2919|rle8 3 ultra|1034.db
113561548| 27.1| 2028| 7114|srle 16|1034.db
136918311| 32.7| 3588| 9026|srle 32|1034.db
165547365| 39.5| 5972|10120|srle 64|1034.db
419225625|100.0|13938|14017|memcpy|1034.db

C Size|ratio%|C MB/s|D MB/s|Name|File (MB=1.000.000)
375094084| 37.5| 470| 1742|trle|enwik9bwt (180510948 w/ TurboRC,0)
415597104| 41.6| 448| 3486|srle 0|enwik9bwt
419263924| 41.9| 538| 4256|srle 8|enwik9bwt
487430623| 48.7| 1347| 6287|srle 16|enwik9bwt
549202860| 54.9| 2780| 8238|srle 32|enwik9bwt
577685254| 57.8| 261| 675|rle8 1|enwik9bwt (229693376 w/ TurboRC,0)
601281880| 60.1| 256| 1031|rle8 3 ultra|enwik9bwt
605759578| 60.6| 5356| 9471|srle 64|enwik9bwt
886471282| 88.6| 366| 4765|rle8 2 single|enwik9bwt
891419852| 89.1| 367| 5464|rle8 4 ultra single|enwik9bwt
1000000008|100.0|13931|13926|memcpy|enwik9bwt

TurboRC: Order 0 bitwise range coder (see Entropy Coding Benchmark)
25 replies | 9525 view(s)
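For readers unfamiliar with the technique being benchmarked above, here is a minimal byte-wise RLE codec - a toy sketch of the general idea, not TurboRLE's or rle8's actual format:

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as (count, byte) pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, byte) pairs back to the original stream."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)

sample = b"\x00" * 100 + b"abc"
packed = rle_encode(sample)
assert rle_decode(packed) == sample
print(len(sample), len(packed))  # 103 8
```

The naive (count, byte) pair format doubles the size of non-repetitive data, which is exactly why real RLE codecs such as trle/srle use escape symbols, variable-length run encodings and SIMD scanning instead.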
• 18th August 2019, 11:48
The baseline is still to be better than JPEG - a baseline that wasn't met by lots of new codecs at the higher-BPP end. https://www.reddit.com/r/AV1/comments/cpwz1v/will_avif_be_superseded_by_jpeg_xl/?sort=new compares PIK against AVIF and BPG. (JPEG XL is around 5% better than the already open-sourced PIK codec.)
41 replies | 4741 view(s)
• 18th August 2019, 01:07
encode replied to a thread CHK Hash Tool in Data Compression
Okay, it's the biggest update ever! Please welcome - CHK v3.00! https://compressme.net/ :_banana2:
179 replies | 77343 view(s)
• 17th August 2019, 19:13
jethro replied to a thread Zstandard in Data Compression
Thanks Cyan *** zstd command line interface 64-bits v1.4.2, by Yann Collet *** Win 10
335 replies | 112042 view(s)
• 17th August 2019, 17:14
I have changed the rle8_ultra implementation to have a maximum run length of 32 (allowing it to not even iterate over the bytes to set, but rather just _mm256_storeu_si256 the symbols and move the write pointer forward by the count). This is obviously slower and less efficient for files with symbols that occur quite often, but it beats TurboRLE in terms of speed on my encoded image test file (by a tiny amount). I believe the results (both speed- and efficiency-wise) can be improved by selecting certain modes adaptively throughout the file (e.g. max length 32; use an unused symbol as the RLE symbol to not stop scanning on single occurrences of the symbol in question; etc.). Obviously TurboRLE still beats rle8 and rle8_ultra easily when not using an 8-bit RLE.

New results:

Encoded image file:
Mode | Compression Rate | Compression Speed | Decompression Speed | Compression rate of result (using rans_static_32x16)
- | 100 % | - | - | 12.861 %
rle8 Ultra (Single Symbol Mode) | 24.1 % | 424.73 MB/s | 2564.1 MB/s | 43.793 %
rle8 Normal (Single Symbol Mode) | 19.9 % | 444.52 MB/s | 2261.3 MB/s | 46.088 %
rle8 Ultra | 24.2 % | 425.2 MB/s | 1559.79 MB/s | 43.681 %
rle8 Normal | 19.9 % | 446.36 MB/s | 1473 MB/s | 45.944 %
trle | 17.2 % | 699.13 MB/s | 1707.79 MB/s | -
srle 0 | 18.7 % | 686.07 MB/s | 2522.70 MB/s | -
srle 8 | 18.7 % | 983.68 MB/s | 2420.88 MB/s | -
mrle | 19.7 % | 208.05 MB/s | 1261.91 MB/s | -

The single-run-length-encodable-symbol file I had also used previously (rle8_ultra is a lot slower here, but normal rle8 still barely beats TurboRLE):
Mode | Compression Rate | Compression Speed | Decompression Speed | Compression rate of result (using rans_static_32x16)
- | 100 % | - | - | 33.838 %
rle8 Normal | 56.7 % | 528.59 MB/s | 5125.2 MB/s | 41.538 %
rle8 Ultra | 58.9 % | 506.46 MB/s | 4417.12 MB/s | 43.64 %
trle | 55.9 % | 307.96 MB/s | 2327.31 MB/s | -
srle 0 | 56.5 % | 306.58 MB/s | 4975.80 MB/s | -
srle 8 | 56.5 % | 354.67 MB/s | 4983.12 MB/s | -
mrle | 56.7 % | 135.02 MB/s | 1837.48 MB/s | -
8 replies | 571 view(s)
• 17th August 2019, 16:31
How good their compressors/decompressors are tells us how "intelligent" the programmers are. Understanding "context" is one measure. I wonder if investigators in other areas who are good with context would perform equally well in data compression.
170 replies | 73456 view(s)
• 17th August 2019, 13:15
Cyan replied to a thread Zstandard in Data Compression
Hi @Jethro, the command line you present should have worked. It is a usual construction that is known and well tested. I can't explain from this snippet why it would not work for you...
335 replies | 112042 view(s)
• 17th August 2019, 10:10
jethro replied to a thread Zstandard in Data Compression
Yes, this is the dict.1 file:

zstd --train .\train\* -o dict.1
Trying 5 different sets of parameters
k=1998 d=8 f=20 steps=4 split=75 accel=1
Save dictionary of size 112640 into file dict.1
335 replies | 112042 view(s)
• 16th August 2019, 23:40
Key Negotiation of Bluetooth Attack, Breaking Bluetooth Security: https://knobattack.com/ New Bluetooth Vulnerability Lets Attackers Spy On Encrypted Connections: https://thehackernews.com/2019/08/bluetooth-knob-vulnerability.html
0 replies | 62 view(s)
• 16th August 2019, 23:12
Shelwien replied to a thread Zstandard in Data Compression
The zstd dictionary file is not raw, you have to build it first:

Dictionary builder:
--train ## : create a dictionary from a training set of files
--train-cover : use the cover algorithm with optional args
--train-fastcover : use the fast cover algorithm with optional args
--train-legacy : use the legacy algorithm with selectivity (default: 9)
-o file : `file` is dictionary name (default: dictionary)
--maxdict=# : limit dictionary to specified size (default: 112640)
--dictID=# : force dictionary ID to specified value (default: random)
335 replies | 112042 view(s)
• 16th August 2019, 22:04
You're right, JPEG isn't random if it's decompressed. But I'm talking about already-compressed data. Randomness in IT does not exist for me; it's all compressible, but only barely. I'm aware of how a JPEG recompressor (such as StuffIt or paq) works. But my custom data preprocessing method is able to compress even random data WITHOUT recompression, i.e. it's not necessary to decompress the JPG file. And it's noticeably faster, although the compression ratio is not that good. 200KB original (preprocessed to almost 4MB - preprocessed, not decompressed, see the difference) to 174 KB lossless is possible with CMIX. @Gotty, what's your progress with the BSC algorithm anti-files?
170 replies | 73456 view(s)
• 16th August 2019, 19:51
Gotty replied to a thread paq8px in Data Compression
It's crazy, isn't it? I'll need to refresh the building instructions a bit anyway.
1664 replies | 475486 view(s)
• 16th August 2019, 19:46
moisesmcardona replied to a thread paq8px in Data Compression
1664 replies | 475486 view(s)
• 16th August 2019, 19:46
jethro replied to a thread Zstandard in Data Compression
How to use a dictionary with ZSTD? I tried:
zstd BJ_all_Corr.csv -D dict.1 -o dict.zstd
zstd: cannot use BJ_all_Corr.csv as an input file and dictionary
How do I tell zstd which file is the trained dictionary (dict.1 here)?
335 replies | 112042 view(s)
• 16th August 2019, 19:42
moisesmcardona replied to a thread paq8px in Data Compression
Hmm. If you were able to compile it with VS 2019 then I should as well... I'm using Visual Studio 2019 Community Edition too. Has the solution file changed, perhaps? Yes, it fails at the linker stage. Windows 10. Latest SDK installed.
1664 replies | 475486 view(s)
• 16th August 2019, 18:44
Gotty replied to a thread paq8px in Data Compression
I tested (compiled and ran) successfully on/with:
Windows 10: Visual Studio Community Edition 2017 15.9.12
Windows 10: MinGW-w64 x86_64-7.2.0-win32-seh-rt_v5-rev1
Windows 8.1: Visual Studio Community Edition 2019 16.1.5
Windows 7: MinGW-w64 x86_64-7.2.0-win32-seh-rt_v5-rev1
Lubuntu 19.04: GCC 8.3.0

Could you try to include the following: #include <shellapi.h>
CommandLineToArgvW is defined in that header, but I did not need to include it on my system, and I think it's not needed. It looks like you have successfully compiled the source and got stuck at linking. Which Windows and which Visual Studio are you using?
(Remark: command line parsing and file operations work everywhere, but character display works properly only on Linux and Windows 10. On Windows 10 you'll need to set a "good" console font (like Lucida Console) if you need to display filenames in exotic languages. The default raster font does not have enough glyphs.)
EDIT: I've got it! In Visual Studio, under Configuration Properties -> Linker -> Input -> Additional Dependencies:
You have: zlibstat.lib
You need: zlibstat.lib;shell32.lib
1664 replies | 475486 view(s)
• 16th August 2019, 17:12
moisesmcardona replied to a thread paq8px in Data Compression
Are there any changes that need to be done to compile it in Visual Studio? It fails with "unresolved external symbol __imp_CommandLineToArgvW". I have the Windows SDK installed and use the paq8px Visual Studio solution files.
1664 replies | 475486 view(s)
• 16th August 2019, 11:06
Darek replied to a thread Paq8pxd dict in Data Compression
@kaitz -> As with the v67 version - for option -s15 and for some files (G.EXE, H.EXE from my filetest and AcroRd32.exe, FlashMX.pdf, MSO97.DLL from Maximum Compression) the program didn't finish the compression process and exited w/o crash... A question about the WAV model - could you also implement the newest model from paq8px v179/v181? And maybe a tip - if you want to optimize enwik scores then it may be reasonable to look at the Paq8pxd_v48_bwt4 version - it got 16'183'xxx bytes for the -s8 option on the model from paq8pxd v48. And scores for paq8pxd v68 -> 0.29% of improvement = -30KB! Very nice gain, especially for the biggest files. That means paq8pxd v68 got a better total score on my testset than cmix v18 :) - 14KB less!
641 replies | 253885 view(s)
• 16th August 2019, 03:37
Gotty replied to a thread paq8px in Data Compression
- Support for Unicode file names
- Printing minimal feedback on screen when output is redirected (file names, file sizes, progress bar)
Only user interface changes - no change in compression.
1664 replies | 475486 view(s)
• 16th August 2019, 03:37
Gotty replied to a thread paq8px in Data Compression
1664 replies | 475486 view(s)
• 16th August 2019, 03:31
Gotty replied to a thread paq8px in Data Compression
Unfortunately it's not enough. While all I/O on Linux works perfectly with char* UTF-8 strings, on Windows even the command line arguments are lost when the input is not compatible with the current codepage (current = before the exe starts executing). If you try to give a Cyrillic filename on a non-Cyrillic locale, Windows will lose it as it can't convert it. You'll need to re-acquire the command line arguments with the CommandLineToArgvW function as wchar_t* (UTF-16) and convert them to UTF-8. The next problem comes when you try to work with functions such as fopen or stat: they can't handle multibyte (UTF-8) or wide (UTF-16) characters on Windows. You'll need to use _wfopen and _wstat (which are not available on Linux). Since they work with wide characters (UTF-16) only, you'll need to do some UTF-8 conversion as well. Here comes paq8px_v181fix1 - it contains all of the above.
1664 replies | 475486 view(s)
• 16th August 2019, 03:06
Gotty replied to a thread paq8px in Data Compression
I'm afraid that is not possible at the moment. There are a lot of tunable parameters in the source file, not just in the form of numbers but in the form of algorithmic choices. For performance reasons they are all hardcoded. To tune any of them you will need to grab the source file, change it, compile, compress some files and see if compression gets better... This is how we all do it. Don't forget that JPEG compression depends not only on the JpegModel but on more: the NormalModel, MatchModel, the Mixer and the SSE stage.
1664 replies | 475486 view(s)
• 16th August 2019, 00:37
moisesmcardona replied to a thread paq8px in Data Compression
1664 replies | 475486 view(s)
• 16th August 2019, 00:12
LucaBiondi replied to a thread paq8px in Data Compression
Ok thanks, I found ZLIB1.DLL. But when I run Paq8px167ContextualMemory I get "VCRUNTIME140_1.dll not found"... Luca https://sqlserverperformace.blogspot.com/
1664 replies | 475486 view(s)
• 15th August 2019, 23:15
> lstm_(25, 3, 20, 0.05) What implementation do you use? lstm-compress, nncp, third-party library (which?), your implementation, ...? Do you use it as a predictor or a mixer? Only 25 nodes? It's fast but scarce.
641 replies | 253885 view(s)
• 15th August 2019, 23:13
Tested -time_steps and -seed on the following 3 files to check if and how much they gain.

1.BMP
333.003 (a) = Darek's min nncp
332.617 (b) = (a) + -time_steps 17
331.765 (c) = (b) + -hidden_size 192
331.746 (d) = (c) + -adam_beta2 0.999
No gain was found by changing -seed.

O.APR
6.052 (a) = Darek's min nncp
6.045 (b) = (a) + -seed 0
No gain was found by changing -time_steps.

R.DOC
34.802 (a) = Darek's min nncp
34.692 (b) = (a) + -time_steps 17
34.659 (c) = (b) + -seed 1
83 replies | 7925 view(s)
• 15th August 2019, 22:14
moisesmcardona replied to a thread paq8px in Data Compression
Hmm, that's interesting. Try using the shared build. That one has the zlib1.dll file.
1664 replies | 475486 view(s)
• 15th August 2019, 22:05
LucaBiondi replied to a thread paq8px in Data Compression
I used the static build
1664 replies | 475486 view(s)
• 15th August 2019, 18:33
moisesmcardona replied to a thread paq8px in Data Compression
Did you use the static or shared build? Try the second zip; I forgot to use the static zlib library in the first one.
1664 replies | 475486 view(s)
• 15th August 2019, 18:27
LucaBiondi replied to a thread paq8px in Data Compression
Hi moisesmcardona, I would try your version, but if I execute it with this syntax: Paq8px167ContextualMemory @txtlist.txt -9 -v -log paq8px__167cm.txt I obtain this error: "Zlib1.dll not found" and then "VCRUNTIME140_1"... Where can I find this file? Thanks, Luca
1664 replies | 475486 view(s)
• 15th August 2019, 17:46
kaitz replied to a thread Paq8pxd dict in Data Compression
paq8pxd_v68 update:
- im8Mode, im24model, exemodel from paq8px_179
- compressed stream /MZIP, EOL, ZLIB fail/
- linearPredictionModel from paq8px_179
- add charModel into wordModel (paq8px_179)
If I add the ppm model to v68 and the lstm model (lstm_(25, 3, 20, 0.05)), then enwik is about 16243xxx bytes with option -8.
641 replies | 253885 view(s)
• 15th August 2019, 15:11
0 replies | 203 view(s)
• 15th August 2019, 14:36
Interesting, I had heard about the P30 having some unique features but not that.
4 replies | 331 view(s)
• 15th August 2019, 14:30
That's a good point. I think a good use for libdeflate and zopfli is precompressing static files, like for Static Site Generators like Jekyll, and SVG images, etc.
6 replies | 246 view(s)
• 15th August 2019, 14:21
Well, it looks like he ultimately got Cloudflare's fork working: https://community.centminmod.com/threads/enable-cloudflare-zlib-performance-library-by-default-for-nginx-zlib.14084/
6 replies | 246 view(s)
• 15th August 2019, 14:10
Yes, libdeflate is very good. The issue with those projects is that none are drop-in replacements for zlib, so they have very limited use. libdeflate doesn't do streaming. zlib-ng was declared by its contributors as not production ready last time I checked. Cloudflare doesn't document their forks and provides no instructions, so they usually don't work. The Intel fork didn't work either when the Centminmod project tried to use it (Cloudflare wouldn't build either).
6 replies | 246 view(s)
More Activity