Activity Stream

  • pklat's Avatar
    Today, 08:52
    Anyway, lossless seems to work great. It's fast and has the best ratio of all. And the patents expired in 2017. Btw, what about 'solid' image compression, as someone already asked? Can/will JPEG XL/WebP use it?
    177 replies | 42736 view(s)
  • Gotty's Avatar
    Today, 08:01
    I don't even understand the compressing part. :-) Yes, please explain how decompression works. For the decompression I expect a less than 9-digit input altogether. If the information you need for the decompression is more than 9 digits, then...
    18 replies | 247 view(s)
  • Gotty's Avatar
    Today, 07:57
    :confused: Now which one? Please explain the key selection process. How do you choose a key? If the key is stored together with the compressed output, then key + compressed output is larger than the input, isn't it? In your above example your input is 9 digits, your key is 10 digits, and your encoded result is 9 digits. I can't see how this is compression.
    18 replies | 247 view(s)
  • compgt's Avatar
    Today, 07:09
    My money was *legitimate* from Hollywood music and movies, and my science and technologies, starting up the tech companies. I co-own Apple, Microsoft, Google and Facebook etc. but were taken from me. I want to own now these tech companies. *** These smartphones are my "standards" too. I standardized their specs and technologies for the different companies, avoiding patents of one another. For the most part, these specs and features are already my "better refinements" of the underlying technologies, especially cameras, distributed according to my timeline of technological improvements. The Apple technologies are mine, specifically, and Samsung technologies too. This i did in the 1970s. The Chinese smartphones companies were planned by or with me too. So i co-own Huawei, Oppo, Xiaomi, and Realme. When the top companies were stolen from me, they would offer me to create a new company, complete with a portfolio or lineup of its technologies. This is what happened to me in the end, creating and creating new companies after one was stolen from me. So i knew i was designing the "timeline" of technologies, not only in smartphones but in most other gadgets and appliances as well. Yet now i haven't enjoyed the more-pricey and more high-tech smartphones or digital cameras, because i don't have money, and when in fact i invented their technologies?! I need my billion$ from these smartphones companies. It's too hard for you to believe because you were conditioned too much by the news and media press releases about these tech giants' making and history. My pronouncements introduce a cognitive dissonance in your minds and belief systems. You must take my statements here as *truth*, even if their purpose is also to make money for me, not just to reveal the true history of modern computing. But of course, you have the right to your own beliefs, i respect that. I ask $125bn from Microsoft, $100bn from Apple, $110bn from Google, $100bn from Facebook, $25bn from IBM, $25bn from Intel, $20bn from AMD, $75bn from Samsung, etc, but a few tens of billion$ will suffice. I remember Jeff Bezos acknowledging my co-ownership of the planned Amazon too. I was the planner/moderator of all these tech giants' making and starting up, that i will be billionaire co-founder too, i said. Please help me inform these tech giants. #PayTamayo
    30 replies | 1524 view(s)
  • Jyrki Alakuijala's Avatar
    Today, 04:21
    Some governmental authorities and archiving organizations consider JBIG2 not suitable for use. """Due to the problems that might occur during compression some experts recommend not to use the algorithm at all. In particular the German federal authority BSI (Bundesamt für die Sicherheit in der Informationstechnik) revised the RESISCAN guideline accordingly. Although JBIG2 is not mentioned explicitly therein it forbids pattern matching / replacement and soft pattern matching algorithms. This implies that JBIG2 shall not be used neither lossless nor lossy. Also the Swiss KOST (Koordinationsstelle für die dauerhafte Archivierung elektronischer Unterlagen) recommends not to use JBIG2 anymore.""" http://blog.pdf-tools.com/2015/07/is-jbig2-soon-banned.html
    177 replies | 42736 view(s)
  • Jyrki Alakuijala's Avatar
    Today, 04:11
    These examples don't look relevant or correct in the light of what is already possible. I'd be more interested in new technology that makes images look crispy and natural. The cheapest unlimited internet connection (based on 4G) that I can buy in my area runs at 10-30 Mbps and is already able to deliver a decent YouTube or Netflix experience running on VP9, even two streams at once.
    24 replies | 1256 view(s)
  • Hcodec's Avatar
    Today, 01:41
    Do you understand the decode, or do you want me to post the video? It is a simple process.
    18 replies | 247 view(s)
  • Hcodec's Avatar
    Today, 01:17
    Here is a video using the 9 numbers, where the entropy is changed by using the transform. It uses the same transform as the example above. Please note the toggle bit is needed to note whether the stream starts with an odd or even number. The stream is also changed from high to low entropy. In the next video I will show how the decode works. It is a simple process; perhaps you have already figured it out. If you reverse the transform of (0,0,1,2,5,6,7,3,5) you end up with the original (8,1,3,4,6,2,7,9,5). Because of the key (3,4,5,6,7,8,9,0,1,2), any number that starts even will end in 0 in 4 or fewer steps, and any number that starts odd will end in 1 in fewer than four steps. Using another method it is possible to change the entropy even more, so that you can use a Huffman code. You do not change the key (however, you can use various keys for multi-key encryption, but that is another topic); you always use the same key. The key, though, becomes part of the plaintext, or in this case part of the stream you are compressing. The new permutation is very subjective. Depending on the key, you can produce all permutations of length n counting in base n, or filter the results so the permutations only progress through specific numbers; in this case, if the number is odd the permutation output will always start with (9,7,5,3,1), and if it is even with (8,6,4,2,0). Then you subtract a 0 or subtract a 1; inversely, you add a 1 or a 0. Video showing transform of number from high entropy to low.
    18 replies | 247 view(s)
  • Gotty's Avatar
    Today, 00:32
    Thank you for sharing your ideas, and for the video link. If you change the key during compression (based on the input?): how would you know which was your compression key during decompression? Can you decompress the result? In your earlier post you mentioned signal bits and padding. How do you know, looking at the compressed result, which bit is a signal bit, which is padding, and which is data?
    18 replies | 247 view(s)
  • Hcodec's Avatar
    Yesterday, 23:59
    @Gotty You will understand once you see the transform. I understand what your concerns are and I am glad that you wrote them. This method has to stand up to the strictest and harshest criticism, and if it does not, then it will go the way of all snake-oil products. I certainly understand that many have come before me claiming the impossible, so I do not expect anyone to give this the time of day, and am thankful for your comments. I reserve the right to fail at this horribly. However, this transform has many applications aside from what I am trying to accomplish with it. The transform can take a block of length N or a bit stream of any length and, depending on the key you use, outputs pseudo-random permutations mapped directly from your bitstream or block, in base N. I made this as an unbreakable encryption, but realized it has other uses. Here is a simple example, and then I will post links to a video for how it can be used as compression. The key in this example is {1,0} in blue. The (1,1) is a starting seed.
    1x 0 1 0 1
    Step 1: the bits in black above are how you build your permutation from your original seed {1,1}. You look at your seed and then find the corresponding position or index in your key. Here we are looking for a 1 from the seed; we see that there is a 1 in position n-1, position 0 (index 0), so we place an x in the key and a 0 beside the position of the seed.
    Step 2: next we are looking for the next 1 in the seed against the first 1 in the key, not counting the x's that have already been marked off. We see that it is in position N-1, or position 1, so our next permutation is complete. Our first was {1,1} and our next subset is (0,1).
    1x 0 1x0 1 1
    Step 3: move the key over for a recursive loop, but use the seed (0,1), the subset of the seed {1,1}. Now we are looking for the first 0 in black against the first position of the 0 in blue, and we see it is in the second index or position 2; but we are using n-1, so it is position 1.
    1x1 0 0x 1x0 1 1 1
    Step 4: we are looking for the position in blue of the black 1, and we see it is in the first index, or position 1, which translates to 0. We now have our next permutation, which is a subset of our set A, which we called a seed, but from here on out it will be a bitstream or a number stream. We do the same as in the previous steps and move our key over. Please note I am using the word key in this example, but when we start using this as a compression scheme our key will change in order and length, and this will allow us to filter what we want our output to be, so we do not produce all permutations, just a select set that we can manipulate easily to compress. We now repeat the steps as before for our next iteration or permutation. Please note all permutations are built from our original seed/random stream. In this case our next permutation is
    1x1x1x 0 0x0x 1x0 1 1 1 1 0 1
    Sorry this took so much time and space, but I wanted to make sure you understood how to do a very, very simple permutation before moving on to compression, because compression uses this scheme. Here is the video link for the process above.
    18 replies | 247 view(s)
  • Gotty's Avatar
    Yesterday, 21:05
    OK. Good. But your example contradicts it. Let's count then. I think it's 9*log2(10) = 29.89735 bits. How did you get your results? If it is a random number, it can be anything between "000000000" and "999999999", right? Emphasis on "anything". How many numbers are there? 1'000'000'000. Can you compress any of them to less? Hohoo, wait-wait. If you have 7 digits left (which covers only one hundredth of the original 9-digit range), there are around 100 original 9-digit numbers that end up being the same 7-digit number after the transformation. So you simply cannot reverse the process and find out which was the original 9-digit number for a specific 7-digit number. Try transforming all the 9-digit numbers, enumerate all the results, and tell us how many "(1,2,5,6,7,3,5)" you had. :-) See the problem?
    18 replies | 247 view(s)
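    A minimal sketch of the counting argument in the post above (plain Python arithmetic, not anyone's compressor):

    ```python
    import math

    # Information content of an arbitrary 9-digit decimal string "000000000".."999999999".
    print(9 * math.log2(10))        # ~29.897 bits, the 9*log2(10) figure quoted above

    # Pigeonhole: 10**9 possible inputs vs 10**7 possible 7-digit outputs.
    inputs, outputs = 10**9, 10**7
    print(inputs // outputs)        # ~100 inputs per output on average, so a 7-digit
                                    # result cannot identify its 9-digit original
                                    # without roughly log2(100) ~ 6.6 extra bits
    ```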
  • Gotty's Avatar
    Yesterday, 20:37
    This topic should be moved to "Random Compression".
    18 replies | 247 view(s)
  • cssignet's Avatar
    Yesterday, 20:20
    Perhaps it is worth compiling it yourself (and eventually adapting the codec to your usage context); you would see huge speed differences. On my very outdated, low/average specs: FX-4100 @ 3.6 GHz, 8 GB RAM, Windows 7 64-bit.
    Lossless:
    cjpegXL_jxl.exe -q 100 chair.png <-- the binary you linked
    Read 180077 bytes (512x512, 5.496 bpp, 22.3 MP/s), encoding, 4 threads. Compressed to 127255 bytes (3.884 bpp). 512 x 512, 0.02 MP/s, 1 reps, 4 threads.
    Kernel Time = 2.308 = 20%, User Time = 37.845 = 334%, Process Time = 40.154 = 354%, Global Time = 11.322 = 100%, Virtual Memory = 82 MB, Physical Memory = 78 MB
    out: 127 255 bytes, 0.00000000 0.000000
    cjpegxl.exe -q 100 chair.png <-- another one
    Read 180077 bytes (512x512, 5.496 bpp, 21.9 MP/s), encoding, 4 threads. Compressed to 127221 bytes (3.882 bpp). 512 x 512, 0.06 MP/s, 1 reps, 4 threads.
    Kernel Time = 0.764 = 18%, User Time = 12.823 = 307%, Process Time = 13.587 = 325%, Global Time = 4.176 = 100%, Virtual Memory = 48 MB, Physical Memory = 43 MB
    out: 127 221 bytes, 0.00000000 0.000000
    wjpegxl.exe -lossless chair.png <-- my experimental implementation, for web usage
    4 threads (preprocessing + tweaked wombat)
    Kernel Time = 0.234 = 35%, User Time = 1.700 = 260%, Process Time = 1.934 = 296%, Global Time = 0.652 = 100%, Virtual Memory = 46 MB, Physical Memory = 39 MB
    out: 125 138 bytes, 0.00000000 0.000000
    Lossy:
    cjpegXL_jxl.exe -q 99 chair.png
    Read 180077 bytes (512x512, 5.496 bpp, 21.9 MP/s), encoding, 4 threads. Compressed to 81201 bytes (2.478 bpp). 512 x 512, 0.41 MP/s, 1 reps, 4 threads.
    Kernel Time = 0.140 = 19%, User Time = 1.482 = 203%, Process Time = 1.622 = 222%, Global Time = 0.728 = 100%, Virtual Memory = 38 MB, Physical Memory = 35 MB
    out: 81 201 bytes, 0.00003091 0.401421
    cjpegxl.exe -q 99 chair.png
    Read 180077 bytes (512x512, 5.496 bpp, 21.8 MP/s), encoding, 4 threads. Compressed to 79990 bytes (2.441 bpp). 512 x 512, 0.69 MP/s, 1 reps, 4 threads.
    Kernel Time = 0.031 = 6%, User Time = 0.826 = 172%, Process Time = 0.858 = 178%, Global Time = 0.480 = 100%, Virtual Memory = 32 MB, Physical Memory = 30 MB
    out: 79 990 bytes, 0.00003197 0.379512
    wjpegxl.exe -lossy=99 chair.png
    4 threads (preprocessing + tweaked wombat)
    Kernel Time = 0.046 = 21%, User Time = 0.296 = 134%, Process Time = 0.343 = 156%, Global Time = 0.220 = 100%, Virtual Memory = 30 MB, Physical Memory = 29 MB
    out: 75 169 bytes, 0.00003267 0.399000
    What about providing pre-compiled binaries for Windows in the repository? Maybe it would be useful for some users and would bring more feedback.
    39 replies | 3481 view(s)
  • Hcodec's Avatar
    Yesterday, 19:33
    Good, I'll move our talk over to the main forum.
    18 replies | 247 view(s)
  • compgt's Avatar
    Yesterday, 19:27
    I don't know why, but people here ask for a compressor/decompressor outright. If you think it is fine to divulge your compression algorithm (which you say is a random data compressor or a new pseudo-random number generator), I think you will find eager listeners here.
    18 replies | 247 view(s)
  • Hcodec's Avatar
    Yesterday, 19:07
    What I meant was I never tried a recursive scheme. I've held on to this compression scheme for eight years. A few months ago I figured out what this does. If you mean a decoder as in: once compressed, can you restore the file, stream, or bits to their original size losslessly? Sure, I did that 8 years ago. It is a simple bijective inverse. I'm more nervous about this as a stream or block cipher, even though RSA and elliptic curves have saturated the learning centers.
    18 replies | 247 view(s)
  • compgt's Avatar
    Yesterday, 18:57
    Yes, eventually your recursive function might stop compressing and start expanding. In 8 years of thinking about data compression, I believe you should have tried coding your ideas to see once and for all whether your algorithm compresses. I had 2 years of intensive random-data-compression thinking (2006-2007), and I still think I already solved it, but without a decoder there's no proof of that. We must not be afraid of actually coding our compression ideas, because it will make us face the truth of whether our algorithm is working or not. But for the most part, experts and academics state that random data compression is *not possible*, by the simplest argument, the pigeonhole principle, and other clever mathematics.
    18 replies | 247 view(s)
  • pklat's Avatar
    Yesterday, 17:59
    I read about it just yesterday, someone complaining on a forum. Presumably it's an OCR problem, so better AI, spellcheck, etc. could ease it. I plan to use lossless anyway. Edit: DjVu uses JBIG2; I understand now how it works, after reading the wiki.
    177 replies | 42736 view(s)
  • choochootrain's Avatar
    Yesterday, 17:12
    Golem took a quick look at x264/x265/av1/vvc at default settings for 4K content: https://www.golem.de/news/h-266-alias-vvc-schoen-langsam-zukunft-ungewiss-2007-149618.html The original stills can be obtained from their gallery.
    SSIM (original.png = 1):
    3 Mbit: vvc 0.832388, av1 0.835461, x265 0.802672, x264 0.746433
    12 Mbit: vvc 0.85819, av1 0.853742, x265 0.848555, x264 0.832085
    DSSIM (original.png = 0):
    3 Mbit: vvc 0.0838061, av1 0.0822697, x265 0.098664, x264 0.126783
    12 Mbit: vvc 0.0709049, av1 0.0731289, x265 0.0757223, x264 0.0839575
    PSNR (original.png = inf):
    3 Mbit: vvc 32.1021, av1 32.3837, x265 29.7331, x264 25.0805
    12 Mbit: vvc 35.1929, av1 34.8772, x265 34.5414, x264 32.7967
    AV1 seems to beat VVC at 3 Mbit; VVC seems to beat AV1 at 12 Mbit in this scene.
    Links to originals:
    https://scr3.golem.de/screenshots/2007/VVC_Test/01tearshortavc-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/02tearshorthevc-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/03tearshortav1-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/04tearshortvvc-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/05tearshortoriginal-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/06_tears_short_kleinavc-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/07_tears_short_kleinhevc-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/08_tears_short_kleinav1-compress.png
    https://scr3.golem.de/screenshots/2007/VVC_Test/09_tears_short_kleinvvc-compress.png
    24 replies | 1256 view(s)
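    For readers who want to reproduce numbers of this kind from the linked stills, here is a minimal sketch (Python; the file names follow the post, the exact SSIM implementation and windowing used for the original measurement are unknown, and DSSIM is taken as (1-SSIM)/2, which is consistent with the values above):

    ```python
    import numpy as np
    from PIL import Image
    from skimage.metrics import structural_similarity

    def load(path):
        return np.asarray(Image.open(path).convert("RGB"))

    def psnr(ref, img, peak=255.0):
        # Standard PSNR over 8-bit RGB; "inf" for identical images, as in the table above.
        mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(peak * peak / mse)

    ref = load("original.png")
    for name in ["vvc_3mbit.png", "av1_3mbit.png", "x265_3mbit.png", "x264_3mbit.png"]:
        img = load(name)
        ssim = structural_similarity(ref, img, channel_axis=-1)  # channel_axis needs a recent scikit-image
        dssim = (1.0 - ssim) / 2.0
        print(f"{name}  SSIM={ssim:.6f}  DSSIM={dssim:.7f}  PSNR={psnr(ref, img):.4f}")
    ```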
  • Hcodec's Avatar
    Yesterday, 17:06
    @compgt I've never tried, but I would think it would eventually hit a wall because of the signal bits that have to be stored for the decode scheme. Good question, thanks... I'll try it out.
    18 replies | 247 view(s)
  • lz77's Avatar
    Yesterday, 15:36
    To all: what is your forecast for the best Rapid (un)compression (total < 40/2.5=16 sec.) for the file TS40.txt: 115 Mb, 110 Mb, 105 Mb... ?
    36 replies | 2071 view(s)
  • Jarek's Avatar
    Yesterday, 12:00
    There are three - from https://www.ibc.org/trends/2020-crunch-time-for-codecs/5569.article
    24 replies | 1256 view(s)
  • compgt's Avatar
    Yesterday, 11:26
    Is your algorithm "recursive" or "perpetual" compression, i.e., can you apply the same algorithm to the output again and again and still achieve compaction?
    18 replies | 247 view(s)
  • Hcodec's Avatar
    Yesterday, 11:15
    Of course, one of the first laws I studied. So I found a way to transform the elements of set A into a subset of highly compressible numbers of lower entropy, where the inverse takes fewer steps (signal bits) to reconstruct than the original size. Let's take a set of nine random numbers, unique so as not to waste time with a Huffman tree: {8,1,3,4,6,2,7,9,5}. 813462795 in binary is 30 bits. Entropy is 28.65982114 bits. After a 4-step transform your number becomes (0,0,1,2,5,6,7,3,5), or (1,2,5,6,7,3,5), but since I have not found a way to make the sets variable length without losing integrity, I'll add padding to make the set 8 digits: (0,1,2,5,6,7,3,5), which is 21 bits plus 2 signal bits plus 2.33 bits for padding a 0, 25.33 bits total, or 2.814777778 bits per number. I would like to explain the transform, which is a great encryption also, but we should probably move this out of off-topic. I am not a programmer; this was a simple hand-cipher compression problem.
    18 replies | 247 view(s)
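    The bit counts quoted above can be checked directly; a small sketch (my reading of the arithmetic, not the poster's transform):

    ```python
    import math

    print((813462795).bit_length())        # 30, as stated for "813462795 in binary"
    print((1256735).bit_length())          # 21, reading "(0,1,2,5,6,7,3,5)" as the integer 01256735
    total_bits = 21 + 2 + 7 / 3            # + 2 signal bits + ~2.33 bits of padding, per the post
    print(total_bits, total_bits / 9)      # ~25.33 bits total, ~2.81 bits per digit

    # For comparison: an arbitrary 9-digit number needs 9*log2(10) ~ 29.9 bits, but a
    # sequence of nine *distinct* decimal digits has only 10*9*8*...*2 = 10! possibilities.
    print(math.log2(math.factorial(10)))   # ~21.79 bits, already below the 25.33 above
    ```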
  • Jyrki Alakuijala's Avatar
    Yesterday, 09:03
    If the documents are in the main role, not the compression, and it is a personal project, use the same compression that everyone else is using. Sorry if I missed sarcasm here... Are you aware of the substitution problem in lossy JBIG2?
    177 replies | 42736 view(s)
  • pklat's Avatar
    Yesterday, 08:55
    I wanted to unpack old .pdf files, mostly scanned text. It seems such a waste to create PDFs out of (scanned) images without proper OCR/vectorizing. JBIG2 has an interesting lossy option; I googled it a bit, and it seems to do OCR or something. One day I'll try to convert 'real' (vector) PDFs to SVG; that seems much better to me. I've also unpacked cbz/cbr and such; I dislike them. Presumably 1-bit images have other applications, too.
    177 replies | 42736 view(s)
  • Gotty's Avatar
    Yesterday, 08:08
    Are you familiar with the pigeonhole principle and the counting argument? #1. #2
    18 replies | 247 view(s)
  • Shelwien's Avatar
    Yesterday, 06:43
    Btw, "mcm -store" also can be used as external preprocessor.
    36 replies | 2071 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 06:17
    This is known. AVIF is currently better at low image quality. JPEG XL is currently better at medium and high image qualities, including the range of image quality used on the internet and in cameras today. It is an interesting question whether low image quality or medium-to-high image quality will matter more in the future. It is a complex dynamic of internet/mobile speeds developing, display resolution increasing, image formats driving the cost of quality lower, HDR, the impact of image quality on commerce, progressive rendering, and other user-related dynamics.
    27 replies | 1605 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 06:03
    Try using cjpegxl as follows:
    $ cjpegxl input.png output.jxl
    Try 'cjpegxl --distance 1.5 input.png output.jxl' for slightly worse quality.
    Don't worry if cjpegxl runs fast; it is likely about 1000x faster than what you are experiencing. If you want slightly (10% or so) higher quality, use --speed kitten, which is still 100x faster than what you use.
    27 replies | 1605 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 05:38
    Out of curiosity: what do you need it for?
    177 replies | 42736 view(s)
  • byronknoll's Avatar
    Yesterday, 04:41
    byronknoll replied to a thread paq8px in Data Compression
    Welcome back mpais! The dictionary preprocessor is something I wrote based on the WRT algorithm - it is not based on MCM. It uses the dictionary from phda.
    1958 replies | 548120 view(s)
  • Shelwien's Avatar
    Yesterday, 04:16
    @Jarek: Sorry, but it's not really a crash... The hosting company is experimenting with mod_security rules to block exploits. Not sure how to deal with it - the VBulletin engine is not very safe, so in some cases it's actually helpful.
    14 replies | 640 view(s)
  • Jarek's Avatar
    Yesterday, 03:43
    I have updated https://arxiv.org/pdf/2004.03391 with perceptual evaluation, also to be combined with this decorrelation (for agreement with 3 nearly independent Laplace distributions) to automatically optimize quantization coefficients. So there is a separate basis P for perceptual evaluation, e.g. YCrCb. In this basis we define d = (d1, d2, d3) weights for the distortion penalty, e.g. larger for Y, smaller for Cr, Cb. There is also a transform basis O into the actually encoded channels (preferably decorrelated), with q = (q1, q2, q3) quantization coefficients. This way the perceptual evaluation (distortion) becomes D = |diag(q) O P^T diag(p)| (Frobenius norm). The entropy (rate) is H = h(X O^T) - lg(q1 q2 q3) + const bits/pixel. If P = O (rotations), the perceptual evaluation is defined on the decorrelation axes; then the distortion D is minimized by quantization coefficients (q1,q2,q3) = (1/d1, 1/d2, 1/d3) times a constant choosing the rate-distortion tradeoff. For a general perceptual evaluation (P != O) we can minimize the rate H under a constraint of fixed distortion D. ps. I didn't use bold 'H' because it literally crashes the server :D ("Internal Server Error"; also italic, underline).
    14 replies | 640 view(s)
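    Restating the quantities from the post above in display form (a transcription of the post's inline notation, not taken from the paper; the diag(p) factor is written as in the post and presumably carries the per-channel distortion weights d defined there):

    ```latex
    D = \left\| \operatorname{diag}(q)\, O\, P^{\mathsf{T}} \operatorname{diag}(p) \right\|_{F},
    \qquad
    H = h\!\left(X O^{\mathsf{T}}\right) - \lg\!\left(q_1 q_2 q_3\right) + \mathrm{const} \ \text{bits/pixel},
    \qquad
    \text{and, if } P = O:\ (q_1, q_2, q_3) \propto \left(1/d_1,\ 1/d_2,\ 1/d_3\right).
    ```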
  • Shelwien's Avatar
    Yesterday, 02:54
    Any comments on the site design? https://globalcompetition.compression.ru/ How do you think we can improve it to increase participation?
    40 replies | 2984 view(s)
  • DZgas's Avatar
    Yesterday, 02:07
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    Another example for you; the file size is 14 KB. Coding time: 43 sec JPEG XL, 39 sec AVIF. JPEG XL saves a few things that AVIF erased if viewed from afar, compared with the look of the original. But AVIF preserves so many shapes and it looks better; JPEG XL looks bad.
    27 replies | 1605 view(s)
  • Hcodec's Avatar
    Yesterday, 01:46
    Thanks Gotty, it was quite the journey! Living in another country, I was only able to do so much. I am trying to figure out what to do with what I learned. What do you mean by pseudo-random data exactly? I am referring to the Kolmogorov complexity or entropy when talking about the complexity of a bit stream. We were on a quest night and day for eight years to come up with a way to compress random data. It was quite the learning experience for me. I invented many concepts for the first time, only to discover that others had made the same discovery many years before. I invented variable-length coding, 3D Cartesian point encryption, fractal geometry, and many other off-the-wall ideas trying to find a way to compress random data... only to find others had invented the same years before. I came up with one idea that I had never seen before or since that showed the most hope, and that is what led me here. I came up with a way to change the number and place value of digits subjectively in a pseudo-random permutation order that allows for an easy inverse. A way to take a random stream of any length of high entropy and change it to very low entropy. It is a new pseudo-random generator that allows for compression. I hope I can explain it more. Yes! I agree all data is pseudo-random unless the source is based on some generator that defies being quantified, like radiation noise (hardware random number generators).
    18 replies | 247 view(s)
  • DZgas's Avatar
    Yesterday, 01:26
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    Well, all codecs can. But at a low bitrate JPEG XL cannot; I just found this problem. At average bitrates JPEG XL compresses well, the same as the other codecs. At low bitrates the quality of AVIF is undeniable.
    27 replies | 1605 view(s)
  • DZgas's Avatar
    13th July 2020, 23:54
    All these companies and services are involved. AV1 has too much support.
    24 replies | 1256 view(s)
  • Darek's Avatar
    13th July 2020, 23:53
    Darek replied to a thread paq8px in Data Compression
    As I remember, when I found my first paq instances (I don't remember exactly when - 2004? 2006?) it was paq4, and then I found paq6. At that time my laptop compressed my testbed in about 3-4 hours... Back then I thought such a compression time was way too slow, but now the fastest paq versions run on my laptop in about 60-70 min... probably on Sportman's machine it would be about 40-50 min. I think, due to the AMD/Intel battle, CPU IPC will improve a lot during the next 5 years, and then "standard" paq will be a really reasonable option. And now the LSTM option compresses my testset in 6 hours (one instance) - a new era has started - who knows when it will be reasonable to use, but I'm sure it will be... :)
    1958 replies | 548120 view(s)
  • Gotty's Avatar
    13th July 2020, 23:51
    Gotty replied to a thread paq8px in Data Compression
    Sorry for the misleading information. I checked only that the LSTM model does not use it, and I somehow believed that the LSTM model (being a neural network) replaces the paq mixer, which is not true. See the answer from mpais above.
    1958 replies | 548120 view(s)
  • pklat's Avatar
    13th July 2020, 23:46
    I tried WebP lossless on a black-and-white image (scanned text mostly), but CCITT has a better ratio. Is there some modern alternative to CCITT? cmix compressed it, but it's just too slow and RAM-consuming. Edit: JBIG2 seems to be better, even than cmix. And it's fast!
    177 replies | 42736 view(s)
  • SolidComp's Avatar
    13th July 2020, 22:25
    These are the companies involved? I don't see any TV makers or AV equipment makers. We'll see how it goes.
    24 replies | 1256 view(s)
  • Darek's Avatar
    13th July 2020, 21:50
    Darek replied to a thread paq8px in Data Compression
    Hmm, my tests show that there is a difference between the "la" and "l" options. Generally small (average 30-70 bytes, but it exists; for M.DBF the difference is about 550 bytes). Are you sure that this option is ignored? For 1.BMP, C.TIF, M.DBF, and from R to X.DOC there are some gains using "la" vs. "l".
    1958 replies | 548120 view(s)
  • mpais's Avatar
    13th July 2020, 20:58
    mpais replied to a thread paq8px in Data Compression
    Thanks for testing, Darek. As for being faster than cmix, that's to be expected (cmix is basically paq8px(d) + paq8hp + mod_ppmd + LSTM + a floating-point mixer), but don't forget that porting these changes to cmix will also make it slightly faster, or at least Byron may choose to increase the LSTM network size and keep the same speed. Imho, even if the results are interesting, I still don't really see much use for the current crop of ML network architectures for data compression; they're all too slow (though progress is being made).
    Seeing as how you guys have been busy cleaning up the code and documenting things, I guess I'll leave a few more comments on the LSTM model:
    - We're using RMSNorm instead of LayerNorm (as in cmix), since it's faster and in testing showed better results.
    - The Softmax is done as in cmix also, using the direct application of the formula, which should by all means be numerically unstable due to the limited range of valid exponents in floating-point single precision, but the usual correct implementation (finding the maximum value first, and using exp(v-max) instead of exp(v)) actually gave worse results in my (admittedly limited) testing.
    - Decoupling the learning rate for the forget gate and tuning it separately showed some promise on smaller files, but I couldn't find a decay strategy that was stable enough to give better overall results.
    - The code can still be made faster: since we don't bother with proper memory alignment, we're using AVX2 unaligned loads and stores.
    - For reproducibility, an all fixed-point processing pipeline would be much better, since working with floating point is just a mess.
    Regarding further improvements, I must say that I'm a bit out of the loop as to what's new in paq8pxd and cmix. I see that paq8pxd now has an option "-w" that seems to be designed for enwik*, and cmix seems to have a dictionary preprocessor based on MCM(?). The mod_ppmd changes you mention in paq8pxd seem to just be a tweak of the maximum order and for allowing higher memory usage; am I missing something? If anything, the LSTM model showed that we really need to improve x86/64 executable compression, since that is where we get some of the best and most consistent gains. This would suggest that there's still a lot of low-hanging fruit that probably needs a different approach which we've been missing. Also, one oft-forgotten advantage of cmix is that it has a floating-point mixer. So even though its paq sub-models just output the same 12-bit predictions (then converted to FP), at least the LSTM predictions use full 32-bit single-precision FP, as do the mixer weights, and, perhaps even more importantly, so do the individual learning rates.
    @Gotty The LSTM model is just another model outputting predictions to the PAQ mixer; it doesn't replace it, so the "-a" option still has the same effect overall. It just doesn't affect the learning rate of the LSTM, since for that we have the decay strategy and the optimizer.
    1958 replies | 548120 view(s)
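    A small, generic illustration of the softmax numerical-stability point made above (a NumPy sketch; this is not the paq8px/cmix code, and it only shows why the direct formula can blow up, not the compression-ratio effect mpais reports):

    ```python
    import numpy as np

    def softmax_direct(v):
        # Direct application of the formula; exp() overflows float32 once a logit exceeds ~88.
        e = np.exp(v.astype(np.float32))
        return e / e.sum()

    def softmax_stable(v):
        # Usual fix: subtract the maximum first; mathematically identical, numerically safe.
        v = v.astype(np.float32)
        e = np.exp(v - v.max())
        return e / e.sum()

    v = np.array([100.0, 99.0, 98.0])
    print(softmax_direct(v))   # [nan nan nan] after an overflow warning
    print(softmax_stable(v))   # approximately [0.6652 0.2447 0.0900]
    ```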
  • Gotty's Avatar
    13th July 2020, 20:25
    Welcome, welcome! Sorry to hear your story. Random data compression is an everyday topic in the encode.su forum. Look under the https://encode.su/forums/19-Random-Compression subforum. What do you mean by pseudo random data exactly? And especially why would you like to compress such data? For me: All random data is pseudo random since all data were created by some process.
    18 replies | 247 view(s)
  • Gotty's Avatar
    13th July 2020, 19:27
    Gotty replied to a thread paq8px in Data Compression
    Oh, that's a lot to test. Hint: the "-a" option is used only for the original paq-Mixer, it has no effect on LSTM, so when using "-L" the option "-a" is ignored. Edit: actually not true: see the reply from mpais below.
    1958 replies | 548120 view(s)
  • spaceship9876's Avatar
    13th July 2020, 18:33
    EVC was ratified ~3 months ago. They haven't released an encoder or decoder to the public yet though. They also haven't announced licensing costs or terms for the main profile either. That codec may have been created not for mass adoption but to pressure the companies involved in VVC to use sensible licensing costs and terms.
    24 replies | 1256 view(s)
  • suryakandau@yahoo.co.id's Avatar
    13th July 2020, 17:38
    Fp8sk16 - tweaked image24bit model - still the fastest paq8 version. Here are the source and the binary file.
    34 replies | 1407 view(s)
  • Jyrki Alakuijala's Avatar
    13th July 2020, 16:48
    Perhaps it is just me, but I see differences.
    27 replies | 1605 view(s)
  • Hcodec's Avatar
    13th July 2020, 16:38
    Great to be here. Long story short, if possible. I became interested in data compression trying to help a brilliant friend, Kelly D. Crawford, Ph.D., overcome his abuse of alcohol, thinking that if he took me on as a student he could hang on long enough to get help. He was a programmer for many well-known companies. We tackled data compression for 8 long years. My job was to come up with out-of-the-box ideas; he would code. I started from knowing nothing. Just as we were making a breakthrough, he passed away from liver problems. That was over a year ago. Looking over some of my notes, I saw many things that I still think are possibilities, but I am not a programmer, aside from taking BASIC back in 1980. I came here to learn and perhaps connect with a programmer who would like to take a look at some of the out-of-the-box ideas for random data compression... pseudo-random data, in particular. Jon
    18 replies | 247 view(s)
  • skal's Avatar
    13th July 2020, 16:12
    skal replied to a thread JPEG XL vs. AVIF in Data Compression
    You should try with another test image; Lena has quite a few problems (starting with being a bad old scan!). Not that I expect different results, but Lena should really be forgotten for tests...
    27 replies | 1605 view(s)
  • Darek's Avatar
    13th July 2020, 16:12
    Darek replied to a thread paq8px in Data Compression
    Yes. However, for such a big change I'll test all parameters:
    1) the best options from the "t", "a", "e" and "l" combinations, to find the best set of options per file;
    2) the memory usage effect -> I'll also test (as usual, because I do it for all versions) options with levels 4 to 12, and for this version options 1 to 3.
    Then most probably I'll find the best scores for my testset plus the 4 corpuses. In general the effectiveness is very high - as I mentioned, some files are compressed better than cmix with (at least) a 2x speedup. From my testset it looks like bigger files get worse scores than cmix (K.WAD for example); however, there are some advantages, like the image parsers (generally for all images, and also for LZ77-precompressed image files = E.TIF), which lead to a better score overall compared to cmix. In my opinion, there are also some additional things to improve, of course => especially since Kaitz, with some other users, made very serious improvements (of course, if he wants to share his ideas), like:
    1) a better 24bpp parser/optimization, which leads to better scores w/o LSTM;
    2) a very good improvement of big textual and HTML file compression, like the -w option (made for the enwik testset but sometimes working with other files) and the usage of an external dictionary which is included into the file;
    3) some additional parsers, like DEC Alpha, which improves the mozilla file by about 500KB;
    4) some ppmd improvements like LucaBiondi made - a nice additional gain due to some ppmd tuning;
    5) and last but not least, for some reason, for my testset paq8px v184 has better compression for audio files than paq8px v188 with LSTM.
    1958 replies | 548120 view(s)
  • Gotty's Avatar
    13th July 2020, 14:18
    Gotty replied to a thread paq8px in Data Compression
    Thank you Darek! I see you are testing with -9 memory level. I agree: this way results are comparable to earlier tests. (But not comparable to cmix results, which is OK, I guess.)
    1958 replies | 548120 view(s)
  • Bulat Ziganshin's Avatar
    13th July 2020, 13:59
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    Thank you for the extremely quick check. For me, the speed loss looks counter-intuitive - on Intel CPUs, AND can be performed on any of 4 ALUs, while SHL can be performed only on 2 ALUs, so AND shouldn't be any worse. Maybe it will be different on other non-ARM CPUs, in particular AMD Zen.
    452 replies | 133287 view(s)
  • DZgas's Avatar
    13th July 2020, 11:54
    Hahaha, rude.
    24 replies | 1256 view(s)
  • ivan2k2's Avatar
    13th July 2020, 11:25
    There are 2 versions of the Calgary corpus, one with 14 files and the other with 18 files. You can use whichever you want)
    1 replies | 117 view(s)
  • lz77's Avatar
    13th July 2020, 10:49
    I wanted to download calgary.tar. Via Google I found https://en.wikipedia.org/wiki/Calgary_corpus#External_links The FTP link "Original home of the Calgary Corpus" did not work, so I used "New home" instead: http://corpus.canterbury.ac.nz/descriptions/#calgary But this archive is different from http://mattmahoney.net/dc/calgary.tar I think the archive on mattmahoney.net is the correct one. Please give authoritative links to the most common corpora (with their MD5/SHA-1) for benchmarking, thanks.
    1 replies | 117 view(s)
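    For pinning down which calgary.tar is which, computing digests locally is enough; a minimal sketch (the reference MD5/SHA-1 values themselves are exactly what the post is asking for, so none are hard-coded here):

    ```python
    import hashlib

    def file_digest(path, algo="sha1", chunk=1 << 20):
        # Stream the file in chunks so large corpora don't need to fit in memory.
        h = hashlib.new(algo)
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    for algo in ("md5", "sha1"):
        print(algo, file_digest("calgary.tar", algo))
    ```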
  • Darek's Avatar
    13th July 2020, 10:07
    Darek replied to a thread paq8px in Data Compression
    Scores of my testset for paq8px_v188 with the -l option (comparison with and without) => about 100KB of gain across almost all files, and new overall records set for C.TIF, D.TGA, E.TIF, I.EXE, L.PAK, S.DOC and T.DOC! Together with paq8pxd, which holds the records for 24bpp images (A.TIF and B.TGA), the paq family holds 16 records out of 28 files! cmix got the rest... :)
    1958 replies | 548120 view(s)
  • Gotty's Avatar
    13th July 2020, 09:27
    Gotty replied to a thread paq8px in Data Compression
    @mpais: Oh, welcome back! Nice to see you again! Thank you for pulling/pushing my latest version. I appreciate it! Yes, that is what happened (after playing around I forgot to put the line back). I was not in a hurry to fix it; I had no idea someone was preparing something in the background ;-)
    1958 replies | 548120 view(s)
  • Jarek's Avatar
    13th July 2020, 08:46
    Jarek replied to a thread JPEG XL vs. AVIF in Data Compression
    Chrome and Firefox are getting support for the new AVIF image format:
    https://www.zdnet.com/article/chrome-and-firefox-are-getting-support-for-the-new-avif-image-format/
    https://news.slashdot.org/story/20/07/09/202235/chrome-and-firefox-are-getting-support-for-the-new-avif-image-format
    27 replies | 1605 view(s)
  • Cyan's Avatar
    13th July 2020, 05:57
    Cyan replied to a thread Zstandard in Data Compression
    Yes, the comment referred to the suggested hash function. Indeed, the `lz4` hash is different, using a double shift instead. Since mixing of the high bits seems a bit worse for the existing `lz4` hash function, that would imply that the newly proposed hash should perform better (better spread). And that's not too difficult to check: replace one line of code, and run on a benchmark corpus (important: have many different files of different types). Quite quickly, it appears that this is not the case. The "new" hash function (a relative of which used to be present in older `lz4` versions) doesn't compress better, in spite of the presumed better mixing. At least, not always, and not predictably. I can find a few cases where it compresses better: x-ray (1.010 -> 1.038), ooffice (1.414 -> 1.454), but there are also counter-examples: mr (1.833 -> 1.761), samba (2.800 -> 2.736), or nci (6.064 -> 5.686). So, to a first approximation, differences are mostly attributable to "noise". I believe a reason for this outcome is that the 12-bit hash table is already over-saturated, so it doesn't matter that a hash function has "better" mixing: all positions in the hash are already in use and will be overwritten before their distance limit. Any "reasonably correct" hash is good enough with regard to this lossy scheme (1-slot hash table). So, why select one instead of the other? Well, speed becomes the next differentiator. And in this regard, according to my tests, there is really no competition: the double-shift variant is much faster than the mask variant. I measure a 20% speed difference between the two, variable depending on the source file, but always to the benefit of the double-shift variant. I suspect the speed advantage is triggered by more than just the instructions spent on the hash itself. It seems to "blend" better with the rest of the match search, maybe due to instruction density, re-use of intermediate registers, or impact on the match search pattern. Whatever the reason, the difference is large enough to tilt the comparison in favor of the double-shift variant.
    452 replies | 133287 view(s)
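    A rough way to see the saturation argument above: count how many of the 4096 slots a 12-bit, single-slot table actually gets cycled through while scanning a file. This is only a sketch with an illustrative multiply/shift hash, not the lz4 source:

    ```python
    M64 = (1 << 64) - 1
    PRIME = 0x9E3779B185EBCA87                  # illustrative odd 64-bit constant

    def hash12(u, double_shift=True):
        if double_shift:                         # shift the low bytes to the top, multiply, shift down
            x = ((u << 24) & M64) * PRIME
        else:                                    # mask to the low 5 bytes, multiply, shift down
            x = (u & 0xFF_FF_FF_FF_FF) * PRIME
        return (x & M64) >> 52                   # keep the top 12 bits -> 4096-entry table

    def occupancy(data, double_shift=True):
        slots = set()
        for i in range(len(data) - 8):
            u = int.from_bytes(data[i:i + 8], "little")
            slots.add(hash12(u, double_shift))
        return len(slots)

    data = open("samba", "rb").read()            # any file from the benchmark corpus
    print(occupancy(data, True), occupancy(data, False), "of 4096 slots touched")
    ```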
  • SolidComp's Avatar
    13th July 2020, 01:39
    Being "free" or not doesn't matter much for a lot of use cases and industries. Some people have an ideological obsession with "free" software for some reason, as opposed to free furniture or accounting services, etc. Lots of industries will pay for a good video compression codec, if it's only a dollar or two per unit. All the TV and AV companies pay, and AVC and HEVC have been hugely successful. It's mostly just browser makers who don't want to pay, so AV1 seems focused on the web.
    24 replies | 1256 view(s)
  • Bulat Ziganshin's Avatar
    13th July 2020, 01:32
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    Sorry, I can't edit the post. I thought that Cyan answered me, but it seems that he answered algorithm.
    452 replies | 133287 view(s)
  • Bulat Ziganshin's Avatar
    13th July 2020, 01:27
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    You are right, except that it's the opposite. Imagine the word '000abcde' (0 here represents a zero byte). The existing code shifts it left, so it becomes 'abcde000', and then multiplies. As a result, the first data byte, i.e. 'a', can influence only the highest byte of the multiplication result. In the scheme I propose, you multiply '000abcde' by the constant, so byte 'a' can influence the 4 higher bytes of the result. Note that you do it the right way on motorola-endian architectures, this time using (sequence >> 24) * prime8bytes.
    452 replies | 133287 view(s)
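    The byte-influence point above is easy to check: change only byte 'a' of '000abcde' and compare the two products (a sketch with an illustrative 64-bit constant, not the actual zstd/lz4 prime):

    ```python
    M64 = (1 << 64) - 1
    PRIME = 0x9E3779B185EBCA87                   # illustrative odd 64-bit constant

    def product_shift(u):                        # existing scheme: '000abcde' -> 'abcde000', then multiply
        return (((u << 24) & M64) * PRIME) & M64

    def product_mask(u):                         # proposed scheme: multiply '000abcde' directly
        return ((u & 0xFF_FF_FF_FF_FF) * PRIME) & M64

    base = 0x00_00_00_11_22_33_44_55             # '000abcde' with a=0x11 ... e=0x55
    flip = base ^ (0xFF << 32)                   # alter only byte 'a'

    print(f"{product_shift(base) ^ product_shift(flip):016x}")  # differs only in the top byte
    print(f"{product_mask(base)  ^ product_mask(flip):016x}")   # differs in the top 4 bytes
    ```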
  • suryakandau@yahoo.co.id's Avatar
    13th July 2020, 01:17
    Fp8sk15 - improved image24 compression ratio. Here are the source and binary file.
    34 replies | 1407 view(s)
  • moisesmcardona's Avatar
    12th July 2020, 23:20
    moisesmcardona replied to a thread paq8px in Data Compression
    CMake.
    1958 replies | 548120 view(s)
  • schnaader's Avatar
    12th July 2020, 22:38
    schnaader replied to a thread paq8px in Data Compression
    The source code is at Github: https://github.com/hxim/paq8px
    1958 replies | 548120 view(s)
  • suryakandau@yahoo.co.id's Avatar
    12th July 2020, 22:37
    And how do I compile it using g++?
    1958 replies | 548120 view(s)
  • suryakandau@yahoo.co.id's Avatar
    12th July 2020, 22:27
    Where can I get the source code?
    1958 replies | 548120 view(s)