Activity Stream

  • Jarek's Avatar
    Today, 08:46
    Jarek replied to a thread JPEG XL vs. AVIF in Data Compression
    Chrome and Firefox are getting support for the new AVIF image format: https://www.zdnet.com/article/chrome-and-firefox-are-getting-support-for-the-new-avif-image-format/ https://news.slashdot.org/story/20/07/09/202235/chrome-and-firefox-are-getting-support-for-the-new-avif-image-format
    21 replies | 1327 view(s)
  • Cyan's Avatar
    Today, 05:57
    Cyan replied to a thread Zstandard in Data Compression
    Yes, the comment referred to the suggested hash function. Indeed, the `lz4` hash is different, using a double-shift instead. Since the mixing of high bits seems a bit worse for the existing `lz4` hash function, it would imply that the newly proposed hash should perform better (better spread). And that's not too difficult to check: replace one line of code, and run on a benchmark corpus (important: have many different files of different types).
    Quite quickly, it appears that this is not the case. The "new" hash function (a relative of which used to be present in older `lz4` versions) doesn't compress better, in spite of the presumed better mixing. At least, not always, and not predictably. I can find a few cases where it compresses better: x-ray (1.010 -> 1.038), ooffice (1.414 -> 1.454), but there are also counter-examples: mr (1.833 -> 1.761), samba (2.800 -> 2.736), or nci (6.064 -> 5.686). So, to a first approximation, the differences are mostly attributable to "noise". I believe a reason for this outcome is that the 12-bit hash table is already over-saturated, so it doesn't matter that a hash function has "better" mixing: all positions in the hash table are already in use and will be overwritten before their distance limit. Any "reasonably correct" hash is good enough with regard to this lossy scheme (1-slot hash table).
    So, why select one over the other? Well, speed becomes the next differentiator. And in this regard, according to my tests, there is really no competition: the double-shift variant is much faster than the mask variant. I measure a 20% speed difference between the two, variable depending on the source file, but always to the benefit of the double-shift variant. I suspect the speed advantage comes from more than just the instructions spent on the hash itself. It seems to "blend" better with the rest of the match search, maybe due to instruction density, re-use of intermediate registers, or impact on the match search pattern. Whatever the reason, the difference is large enough to tilt the comparison in favor of the double-shift variant. (A minimal sketch of the two variants follows this entry.)
    451 replies | 133023 view(s)
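    For readers following the thread, here is a minimal sketch of the two hash variants being compared. The prime constant and hashLog value are placeholders chosen for illustration, not necessarily the exact values used in lz4 or zstd.

        #include <cstdint>

        // Placeholder constants, for illustration only.
        static const uint64_t prime8bytes = 11400714785074694791ULL; // assumed 64-bit multiplier
        static const int hashLog = 12;                               // 12-bit table, as discussed above

        // "Double-shift" variant, in the spirit of the current lz4 code linked in the thread:
        // shift the loaded bytes to the top, multiply, then keep the top hashLog bits.
        static inline uint32_t hashDoubleShift(uint64_t sequence) {
            return (uint32_t)(((sequence << 24) * prime8bytes) >> (64 - hashLog));
        }

        // "Mask" variant proposed in the thread: keep the low 4 bytes, multiply,
        // then keep the top hashLog bits, so each input byte can spread further.
        static inline uint32_t hashMask(uint64_t sequence) {
            return (uint32_t)(((sequence & 0xFFFFFFFFULL) * prime8bytes) >> (64 - hashLog));
        }

    Per the post above, with a saturated 1-slot hash table the choice between the two is decided mostly by speed rather than by compression ratio.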
  • SolidComp's Avatar
    Today, 01:39
    Being "free" or not doesn't matter much for a lot of use cases and industries. Some people have an ideological obsession with "free" software for some reason, as opposed to free furniture or accounting services, etc. Lots of industries will pay for a good video compression codec, if it's only a dollar or two per unit. All the TV and AV companies pay, and AVC and HEVC have been hugely successful. It's mostly just browser makers who don't want to pay, so AV1 seems focused on the web.
    17 replies | 869 view(s)
  • Bulat Ziganshin's Avatar
    Today, 01:32
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    Sorry, I can't edit the post. I thought that Cyan answered me, but it seems that he answered algorithm.
    451 replies | 133023 view(s)
  • Bulat Ziganshin's Avatar
    Today, 01:27
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    You are right, except that it's the opposite. Imagine the word '000abcde' (0 here represents a zero byte). The existing code shifts it left, so it becomes 'abcde000', and then multiplies. As a result, the first data byte, i.e. 'a', can influence only the highest byte of the multiplication result. In the scheme I propose, you multiply '000abcde' by the constant, so byte 'a' can influence the 4 higher bytes of the result. Note that you already do it the right way on Motorola-endian (big-endian) architectures, this time using (sequence >> 24) * prime8bytes. (A tiny demonstration of this byte-influence argument follows this entry.)
    451 replies | 133023 view(s)
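    To make the byte-influence argument concrete, here is a tiny, hedged demonstration: it flips one bit in the highest data byte and prints which bits of the 64-bit product change under each ordering. The prime constant is a placeholder, not necessarily the one used in lz4.

        #include <cstdint>
        #include <cstdio>

        // In a 64-bit product, an operand bit can only influence product bits at or
        // above its own position. With '000abcde' multiplied as-is, byte 'a' sits at
        // bits 32..39 and can reach the top 4 bytes; after shifting to 'abcde000' it
        // sits at bits 56..63 and can only reach the top byte.
        int main() {
            const uint64_t prime8bytes = 11400714785074694791ULL; // placeholder multiplier
            const uint64_t word    = 0x0000001122334455ULL;       // '000abcde'-style input
            const uint64_t flipped = word ^ (1ULL << 32);         // flip one bit of byte 'a'
            uint64_t maskDiff  = (word * prime8bytes) ^ (flipped * prime8bytes);
            uint64_t shiftDiff = ((word << 24) * prime8bytes) ^ ((flipped << 24) * prime8bytes);
            printf("multiply-as-is diff:      %016llx\n", (unsigned long long)maskDiff);  // top 4 bytes affected
            printf("shift-then-multiply diff: %016llx\n", (unsigned long long)shiftDiff); // only top byte affected
            return 0;
        }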
  • suryakandau@yahoo.co.id's Avatar
    Today, 01:17
    Fp8sk15 - improved image24 compression ratio. Here are the source and binary files.
    33 replies | 1258 view(s)
  • moisesmcardona's Avatar
    Yesterday, 23:20
    moisesmcardona replied to a thread paq8px in Data Compression
    CMake.
    1948 replies | 547262 view(s)
  • schnaader's Avatar
    Yesterday, 22:38
    schnaader replied to a thread paq8px in Data Compression
    The source code is at Github: https://github.com/hxim/paq8px
    1948 replies | 547262 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 22:37
    And how do I compile it using g++?
    1948 replies | 547262 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 22:27
    Where can I get the source code?
    1948 replies | 547262 view(s)
  • DZgas's Avatar
    Yesterday, 22:21
    VVC has patents - well, use AV1! AVC/HEVC, and most likely VVC, have very complicated coding algorithms with extremely complex settings. For ease of use there are presets, which are just bundles of those hard settings - movie or animation presets, for example. VP9 had essentially one knob: encode quickly, slowly, or very slowly. That's all. AV1 is the same, a choice of encoding speed. I think the developers from all the companies selected the most complex algorithms for all cases, so that the encoders work equally well in every situation, and that's why VVC will have only a microscopic advantage for specifically targeted products.
    17 replies | 869 view(s)
  • Jarek's Avatar
    Yesterday, 22:14
    Oh, indeed - I got that feeling from checking a few years ago, but I see it has improved ~2018. ps. Just noticed that there is also EVC/MPEG-5 coming this year: https://en.wikipedia.org/wiki/Essential_Video_Coding
    17 replies | 869 view(s)
  • Cyan's Avatar
    Yesterday, 21:38
    Cyan replied to a thread Zstandard in Data Compression
    In this construction, top bit of `sequence` contributes to the entire hash. This feels intuitively better mixed than other hash formulas where the top bit only contributes to the top bit of the hash.
    451 replies | 133023 view(s)
  • Darek's Avatar
    Yesterday, 21:34
    Darek replied to a thread paq8px in Data Compression
    You are absolutely right - I'm testing 3 runs at once, comparing non-"l" mode with "l" mode, so both timings are slower (both tests were made at the same time under similar conditions, because I also need to test the influence of the "t", "e", "a" and "clear" options on LSTM, so I started some additional instances). As I wrote, I'll need to test the timings more deeply, because the slowdowns could differ between instances. My comparison is only a rough estimate. But the scores are good.... :)
    1948 replies | 547262 view(s)
  • mpais's Avatar
    Yesterday, 21:24
    mpais replied to a thread paq8px in Data Compression
    I'm guessing that, even though your laptop has an AVX2 capable CPU, you're testing multiple simultaneous runs at lower (-9?) levels? That will absolutely destroy performance, since AVX2 code really makes CPUs run hot, hence you get thermal throttling, and the required memory bandwidth performance just isn't there either. I've seen the same thing happen in my testing, running 5 instances at once makes all of them over 2x slower :rolleyes: That's why I made it optional, it's just there in case you guys want to try it out and play with its settings.
    1948 replies | 547262 view(s)
  • moisesmcardona's Avatar
    Yesterday, 21:12
    moisesmcardona replied to a thread paq8px in Data Compression
    I have submitted a Pull Request updating the CMakeLists file to compile using CMake/GCC. It compiled successfully on GCC 10.1.0: gcc version 10.1.0 (Rev3, Built by MSYS2 project).
    1948 replies | 547262 view(s)
  • Darek's Avatar
    Yesterday, 21:05
    Darek replied to a thread paq8px in Data Compression
    I've started to test my test set. It looks like the -l option is 5-10 times slower than non-LSTM on my laptop. I need to test timings more deeply. For the first files, some scores are better than cmix with 50% of the time spent on compression...
    1948 replies | 547262 view(s)
  • mpais's Avatar
    Yesterday, 20:39
    mpais replied to a thread paq8px in Data Compression
    The LSTM model, if enabled, is used with every block type. Since LSTMs are very slow learners, we don't just use its prediction; we use the current expected symbol (at every bit) and the current epoch, along with an indirect context, as additional context for mixer contexts and APM contexts. In testing this significantly improved compression on relatively small files (up to 0.3% on small x86/64 executables, 0.13% on average on my testset).
    The code was written with C++11 in mind, since that's what's used in cmix, so on VS2019 with C++17 you'll get nagged that std::result_of is deprecated, and it doesn't use std::make_unique. The memory usage is also not reported to the ProgramChecker.
    I tried to make it reasonably modular, so it would be easy to exchange components and tweak parameters. Included are 2 learning rate decay strategies (PolynomialDecay and CosineDecay) and 4 activation functions (Logistic, Tanh, HardSigmoid, HardTanh), and the optimizer used is Adam, though I also implemented others (Nadam, Amsgrad, Adagrad and SGD with(out) Nesterov Accelerated Gradient), just not with AVX2 instructions, and none gave better results than Adam anyway. The current configuration is 2 layers of 200 cells each, and horizon 100, as in cmix, but with a different learning rate and gradient clip threshold. You can also configure each gate independently. (A purely illustrative configuration sketch follows this entry.) I also omitted unused code, like the functions to save and load the weights (compatible with cmix), since those just dump the weights without any quantization.
    In the future, I'd like to try loading pre-trained language models with quantized 4-bit weights, possibly using N different LSTM models and using just the one for the currently detected text language. The problems with that approach are finding large, good, clean datasets that are representative of each language, and the computing power required to train a model on them.
    1948 replies | 547262 view(s)
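    As a reading aid only, here is a hypothetical configuration struct that restates the parameters described above. The type names, field names and default values are illustrative assumptions, not the actual paq8px or cmix code.

        // Illustrative only - not the real paq8px/cmix interface.
        enum class Activation { Logistic, Tanh, HardSigmoid, HardTanh };
        enum class Decay      { Polynomial, Cosine };
        enum class Optimizer  { Adam }; // Nadam/Amsgrad/Adagrad/SGD were tried but not kept

        struct LstmConfig {
            int        numLayers      = 2;      // "2 layers of 200 cells each"
            int        numCells       = 200;
            int        horizon        = 100;    // as in cmix
            float      learningRate   = 0.001f; // placeholder; the post says it differs from cmix
            float      gradientClip   = 2.0f;   // placeholder clip threshold
            Decay      decay          = Decay::Polynomial;
            Activation gateActivation = Activation::Logistic; // per-gate choice is configurable
            Optimizer  optimizer      = Optimizer::Adam;
        };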
  • algorithm's Avatar
    Yesterday, 20:18
    algorithm replied to a thread Zstandard in Data Compression
    No, it's not. A multiplicative hash function works better when the bits are in the upper portion. So it's about compression ratio. EDIT: Sorry, I am wrong. The final shift is important for compression. But note that it is a 5-byte hash and not 4.
    451 replies | 133023 view(s)
  • Bulat Ziganshin's Avatar
    Yesterday, 19:31
    Bulat Ziganshin replied to a thread Zstandard in Data Compression
    I haven't found a thread dedicated to LZ4, and anyway there is similar code in zstd too: https://github.com/lz4/lz4/blob/6b12fde42a3156441a994153997018940c5d8142/lib/lz4.c#L648 It looks like it was optimized for ARM CPUs that support a built-in bit shift in many ALU instructions, am I right? I think that on x64 it would be both faster and provide better hash quality: return (U32)(((sequence & 0xFFFFFFFF) * prime8bytes) >> (64 - hashLog));
    451 replies | 133023 view(s)
  • SolidComp's Avatar
    Yesterday, 19:09
    Jarek, HEVC isn't unused. It's widely used. It's the format used for Ultra HD Blu-ray, which is awesome. It's also used in streaming 4K content on a lot of platforms, and is supported by most 4K TVs and streaming devices. HEVC is much better than VP9, for reasons I don't understand. So it won this round. It's also not clear that VVC will have the same licensing issues that HEVC had. The Wikipedia article isn't written in an encyclopedic tone, and there's no explanation behind the opinions expressed by whoever wrote that part.
    17 replies | 869 view(s)
  • SolidComp's Avatar
    Yesterday, 19:01
    How did they beat AV1 so handily? I'm interested in the social science and cognitive science of both software development and domain-specific areas like video compression technology and theory. How do you think they beat AV1? Who was responsible? Was it likely to be a single person driving the technological advances, a small team, or a large team? Did they have to spend a lot of money developing VVC, you think? Is it the kind of thing where they'd have to recruit brilliant software engineers and video compression experts and pay them huge salaries and bonuses? I mean, there's no equity opportunity in working for some industry forum or coalition, no stock options and so forth. It's not like working for a Silicon Valley company. I wonder how the talent acquisition works. And the management and direction.
    17 replies | 869 view(s)
  • Sportman's Avatar
    Yesterday, 19:01
    With these three tools you can disable most Windows background activities:
    https://www.oo-software.com/en/shutup10
    https://www.w10privacy.de/english-home/
    https://docs.microsoft.com/en-us/sysinternals/downloads/autoruns
    Run as admin and reboot after changes.
    3 replies | 98 view(s)
  • Darek's Avatar
    Yesterday, 18:29
    Darek replied to a thread paq8px in Data Compression
    Welcome back mpais! :) One question - does the LSTM work with all data types or only with the standard/text parts? OK, don't answer, I know - it works.
    1948 replies | 547262 view(s)
  • mpais's Avatar
    Yesterday, 17:36
    mpais replied to a thread paq8px in Data Compression
    Changes:
    - New LSTM model, available with the option switch "L"
    This is based on the LSTM in cmix, by Byron Knoll. I wanted to play with it a bit, but testing with cmix is simply too slow, so I ported it to paq8px and tried to speed it up with AVX2 code. Since we're using SIMD floating point code, files created using the AVX2 code path and the normal code path are incompatible, as expected. In testing, the AVX2 version makes paq8px about 3.5x slower, so at least not as bad as cmix. Not sure that's much of a win...
    Note: The posted executable is based on v187fix5, which was the most up-to-date version when I was testing. The pull request on the official repository is based on 187fix7, Gotty's latest version, but something in that version broke the x86/64 transform, and I don't really have the time to check all the changes made between those versions.
    [EDIT] Found the bug, Gotty seems to have mistakenly deleted a line in file filter/exe.hpp:
    uint64_t decode(File *in, File *out, FMode fMode, uint64_t size, uint64_t &diffFound) override {
      ...
      size -= VLICost(uint64_t(begin));
      ...
    }
    1948 replies | 547262 view(s)
  • DZgas's Avatar
    Yesterday, 17:06
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    JPEG XL at the maximum preset is very slow, same as AVIF. For me they are both about the same in quality and speed. But JPEG XL cannot compress as strongly...
    21 replies | 1327 view(s)
  • compgt's Avatar
    Yesterday, 17:02
    Unless they already have one, tech giants like Google, Microsoft, Facebook etc. should buy any breakthrough (fast) data compression algorithm that comes along, say one with a 90% or better compression ratio.
    4 replies | 62 view(s)
  • DZgas's Avatar
    Yesterday, 16:45
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    With this quality (1 BPP) JpegXL is indistinguishable from AVIF.
    21 replies | 1327 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 14:57
    What happens if you compress to 32 kB, i.e., 1 BPP? Currently internet compression averages 2-3 BPP and cameras 4-5 BPP. JPEG XL attempts to create a good result if you ask it to compress at distance 1.
    21 replies | 1327 view(s)
  • Jarek's Avatar
    Yesterday, 13:38
    Here is a March 2020 benchmark paper: https://arxiv.org/abs/2003.10282 It claims ~25-30% coding gain over HEVC (5-10% over AV1) ... at a bit more than half the complexity of AV1 ... Looks good if fair ... but with even greater licensing issues than the nearly unused HEVC.
    17 replies | 869 view(s)
  • paleski's Avatar
    Yesterday, 13:23
    Apparently the 50% reduction is a rather optimistic (marketing) statement; maybe it can be reached only in some selected cases with subjectively perceived quality. A 30%-40% reduction sounds more realistic, at least for the first waves of encoders. Earlier this year bitmovin.com posted an article mentioning VVC test results compared to HEVC: similar PSNR values were achieved while reducing the required bandwidth by roughly 35%.
    17 replies | 869 view(s)
  • DZgas's Avatar
    Yesterday, 13:08
    DZgas replied to a thread JPEG XL vs. AVIF in Data Compression
    I finally got time to try cjpegXL_jxl 0.0.1-e3c58a0a and did tests with strong compression... the result is strange. JPEG XL lost to everything, even to its brother from 10 years ago - JPEG XR. Maybe I'm doing something wrong, or the codec is doing something wrong, or JPEG XL can't compress in this range...
    21 replies | 1327 view(s)
  • Gotty's Avatar
    Yesterday, 12:35
    No "inventor" can patent a thing that was not invented. In the patent application you need to describe how your invention works. And since there is no such invention, they cannot describe it, and so they cannot patent is. Hmmm... where did you see? My conclusion: no, they can't sue you.
    4 replies | 62 view(s)
  • Jarek's Avatar
    Yesterday, 12:28
    From https://en.wikipedia.org/wiki/Versatile_Video_Coding They created a new group to reduce licensing problems ... so now there are 4 companies fighting for licenses on one codec - brilliant! :D PS. While they say it is completed, I see the ISO standard (ISO/IEC DIS 23090-3) is still "under development", in the Enquiry stage (between Committee and Approval): https://www.iso.org/standard/73022.html
    17 replies | 869 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 11:35
    Fp8sk14 - use the -9 option in this version. Here are the source code and binary file. @Gotty, could you add options up to -15 for fp8sk? Thank you!
    33 replies | 1258 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 10:55
    What happens if you build it from https://jpeg.org/jpegxl/ ? Please file an issue there if a compilation does not produce a result that you are happy with. At the moment we are best able to improve on issues found with a Linux compilation. We will improve our cross-platform support soon.
    38 replies | 3324 view(s)
  • randomthing's Avatar
    Yesterday, 10:43
    Thanks Shelwien for your comment and your kind information. I didn't mean that those companies would be affected by such a compression engine; I just meant, would they sue over the invention? I saw that big corporations have already filed patents for random compression or recursive compression, and that's why I ask whether they would sue the author who invented such a thing.
    4 replies | 62 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 10:32
    This is the correct binary file.
    33 replies | 1258 view(s)
  • Shelwien's Avatar
    Yesterday, 10:31
    There'd be nothing special. Even if there was a way to compress any file to a 128-bit hash (like, based on a time machine), it would only matter if it's faster than the internet. Actually, even as it is, there's a lot of known redundancy in stored and transmitted content, since it's compressed with inefficient methods (deflate etc.) or isn't compressed at all. We could save maybe 30% of all traffic and storage if there was strict enforcement of best compression everywhere. And even with a random compression method which is fast, cheap and reliable, it still won't affect the corporations you mentioned. Sure, some storage companies may go bankrupt (WD, Seagate etc.), but for Google and Microsoft... it would just allow them to save some money on storage.
    4 replies | 62 view(s)
  • randomthing's Avatar
    Yesterday, 10:10
    Hey geeks, we know that random compression is impossible because mathematics has already proven this. But let's take it as science fiction: someone comes up with a compression engine that compresses random data recursively - what happens next? I mean, big corporations like IBM, Google, Microsoft, etc. - what would they do? Would they sue the author who discovered the engine? Or would they use the engine to reduce their database costs or for other things like streaming, etc.?
    4 replies | 62 view(s)
  • Shelwien's Avatar
    Yesterday, 09:49
    There's a native GUI, but only 32-bit version (nanozip.exe in win32 archive): http://nanozip.ijat.my/
    304 replies | 320931 view(s)
  • paleski's Avatar
    Yesterday, 09:04
    Is there some kind of GUI for NanoZip available? Or as a plugin for other archiver utilities, file browsers, etc?
    304 replies | 320931 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 07:23
    Fp8sk13 - the fastest paq8 version ever. fp8sk13 -1 enwik9 on my old laptop (Intel Core i5-3210M 2.5GHz, 16 GB RAM): Total 1000000000 bytes compressed to 202158507 bytes. Time 5389.15 sec, used 850842021 bytes of memory. :_superman2: I wonder, if it ran on an Intel Core i7 with higher GHz, could it be faster? This is the correct binary file.
    33 replies | 1258 view(s)
  • Trench's Avatar
    Yesterday, 06:59
    pacalovasjurijus: game-maker programs let you make games without programming; website drag-and-drop builders mean you don't need to program; Cheat Engine edits game CODE without needing to program; a hex editor edits any file without knowing how to program; etc. Obviously they have limits, but you can do something with them.
    22 replies | 602 view(s)
  • Gotty's Avatar
    Yesterday, 02:22
    I disabled Windows Update a looong time ago (I'm still on 1803, maybe it's time to move on...). Not just disabled updates, but prevented Windows from turning it back on again. No popups. No searching for updates in the background. No sudden restarts. :_secret2: In the registry, modify the startup type of the wuauserv service to disabled and remove all permissions on this key - except for yourself. This way Windows cannot turn it back on again.
    3 replies | 98 view(s)
  • Shelwien's Avatar
    Yesterday, 01:20
    With disabled updates it stops doing that.
    3 replies | 98 view(s)
  • suryakandau@yahoo.co.id's Avatar
    11th July 2020, 17:08
    Sorry for the late post. Here is the source code.
    33 replies | 1258 view(s)
  • moisesmcardona's Avatar
    11th July 2020, 15:32
    moisesmcardona replied to a thread Fp8sk in Data Compression
    Source code..... (Could we ban him for violating the GPL rules?)
    33 replies | 1258 view(s)
  • Gotty's Avatar
    11th July 2020, 13:38
    Gotty replied to a thread ARM vs x64 in The Off-Topic Lounge
    Oh, no problem. You have to know, I don't know much about the topic. I felt the 1st video was a bit (?) biased, but since I didn't find hard numbers that clearly support or refute the claims (the full picture is still very blurry)... I thought I'd post these - they could be interesting for the readers of the thread. Thank you for posting your view.
    13 replies | 770 view(s)
  • paleski's Avatar
    11th July 2020, 09:16
    Maybe here: http://forum.doom9.org/showpost.php?p=1916331&postcount=281
    38 replies | 3324 view(s)
  • Piotr Tarsa's Avatar
    11th July 2020, 08:24
    (Sorry if this post is a little rude, but I'm fed up with baseless stories about ARM inferiority) The guy is spewing typical nonsense:
    - ARM can't have as high performance as x86
    - ARM lacks some mysterious features that only x86 can have
    - ARM can't integrate with as much hardware as x86
    - etc.
    Where's any proof of that? The actual situation seems to be quite the opposite:
    - ARM in the form of Apple Silicon has very high performance already and it's going up quickly. That is visible even in the first post here.
    - I haven't seen any example of functionality that is possible on x86 but isn't on ARM. x86 prophets tell us otherwise, but is x86 a religion? You have access to the ISAs (instruction set architectures), so you can look for the mysterious features yourself, but there aren't any.
    - ARM can be integrated with hardware typically seen with x86, e.g. nVidia has full support for CUDA on ARM processors (currently nVidia supports the x86, ARM and POWER architectures), nVidia Shield is an integration of ARM with GeForce, there are rumors of Samsung integrating RDNA (the new Radeon cores) in their upcoming ARM-based smartphone SoCs, etc.
    I'm mostly interested in any logical explanation of why ARM can't scale its performance up to the levels of x86 or above. No biased, unfounded, vague claims, but actual technical analysis showing understanding of the ARM and x86 architectures. Mac on ARM is an unknown, so it's a perfectly logical idea to wait for independent benchmarks and see how the software we're interested in will perform on Apple Silicon based machines. Nothing shocking there. Same goes for choosing between an AMD CPU and an Intel CPU, or an AMD GPU and an nVidia GPU.
    13 replies | 770 view(s)
  • Alexander Rhatushnyak's Avatar
    11th July 2020, 06:58
    Here is a prooflink, and a proofshot is attached. If you ever tried to prevent Windows 10 from rebooting for more than a week, please describe your experiences. Thank you!
    3 replies | 98 view(s)
  • Alexander Rhatushnyak's Avatar
    11th July 2020, 06:52
    Could anyone please build and provide cjpegxl and djpegxl executables? Either 32-bit or (better) 64-bit, either Windows or (better) Linux, for either the earliest or the latest Intel Core* CPUs? I couldn't find any executables, and those that I built myself earlier this year were somehow a lot slower than expected (on my machines). Thank you in advance!
    38 replies | 3324 view(s)
  • suryakandau@yahoo.co.id's Avatar
    11th July 2020, 03:19
    Fp8sk12 - faster than the previous version but with a worse compression ratio :) fp8sk12 -8 mixed40.dat: Total 400000000 bytes compressed to 56415009 bytes. Time 3122.57 sec, used 1831298304 bytes of memory (run under Windows 10, Intel Core i5-3210M 2.5GHz, 16 GB memory). If it ran on an Intel Socket 1151 Core i7-8700K (Coffee Lake, 3.7GHz, 6C12T, 95W TDP), how fast could it be? :_superman2:
    33 replies | 1258 view(s)
  • Dresdenboy's Avatar
    11th July 2020, 02:08
    There is the Horspool paper (Improving LZW), which describes it. I also checked his source - no additional information. But Charles Bloom did some interesting analysis of this encoding (calling it "flat codes" BTW): http://cbloomrants.blogspot.com/2013/04/04-30-13-packing-values-in-bits-flat.html (A minimal sketch of these flat codes follows this entry.)
    11 replies | 740 view(s)
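    For reference, here is a minimal, hedged sketch of phased-in binary ("flat") codes for an alphabet of n symbols where n is not a power of two; the function name and interface are illustrative only. Symbols below a threshold get k = floor(log2(n)) bits, the rest get k+1 bits (e.g. for n = 5: 0,1,2 -> 00,01,10 and 3,4 -> 110,111).

        #include <cstdint>
        #include <cassert>

        struct FlatCode { uint32_t bits; int length; }; // codeword value and bit length

        // Phased-in ("flat") binary code for v in [0, n), n >= 2.
        FlatCode flatEncode(uint32_t v, uint32_t n) {
            assert(n >= 2 && v < n);
            int k = 31 - __builtin_clz(n);              // floor(log2(n)), GCC/Clang builtin
            uint32_t shortCount = (1u << (k + 1)) - n;  // how many symbols get only k bits
            if (v < shortCount) return { v, k };        // short, k-bit codeword
            return { v + shortCount, k + 1 };           // long, (k+1)-bit codeword
        }

    Decoding mirrors this: read k bits; if the value is below shortCount it is the symbol, otherwise read one more bit and subtract shortCount.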
  • Dresdenboy's Avatar
    11th July 2020, 01:54
    Did you try it yet? Even Charles Bloom covered this encoding with some interesting analysis here. In another thread Mauro Vezossi brought up an LZW variant using this phased-in binary encoding, called lzw-ab. He also did some tests:
    15 replies | 17315 view(s)
  • JamesB's Avatar
    11th July 2020, 00:32
    Ah, Tom Scott. A nice basic explanation. Yes, it misses a lot out, but that's not the point. I had to chuckle at the BBC Micros in the background. My very first compression program was a Huffman compressor written in 6502 assembly on the BBC Micro. :-)
    3 replies | 223 view(s)
  • Jarek's Avatar
    10th July 2020, 23:19
    L1-PCA is not exactly what we want here: we want the lowest bits/pixel, so I have directly optimized
    e(x) = sum_{d=1..3} lg(sum_i |x_id|)
    which is literally the sum of the entropies of 3 estimated Laplace distributions and can be translated into approximate bits/pixel. For all images it led to this averaged transform:
     0.515424,  0.628419,  0.582604
    -0.806125,  0.124939,  0.578406
     0.290691, -0.767776,  0.570980
    The first vector (as a row) is a kind of luminosity (Y) and should get higher accuracy; the remaining two correspond to colors - I would just use finer quantization for the first one and coarser for the remaining two. But we could also directly optimize both the rotation and some perceptually chosen distortion evaluation instead - I have just written the theory today and will update the arxiv in a day or two. (A rough sketch of evaluating this objective follows this entry.)
    12 replies | 546 view(s)
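    For readers who want to experiment, here is a rough sketch of evaluating the objective above for a candidate 3x3 color transform; the names, containers and use of double precision are illustrative assumptions, not the author's actual code.

        #include <cmath>
        #include <vector>
        #include <array>

        // Rotate each RGB pixel by a 3x3 transform M and sum log2 of the absolute
        // sums per output channel: e(x) = sum_{d=1..3} lg(sum_i |x_id|), i.e. the
        // approximate total entropy of three Laplace-distributed channels.
        double transformCost(const std::vector<std::array<double, 3>>& pixels,
                             const std::array<std::array<double, 3>, 3>& M) {
            double absSum[3] = {0.0, 0.0, 0.0};
            for (const auto& p : pixels)
                for (int d = 0; d < 3; ++d)
                    absSum[d] += std::fabs(M[d][0] * p[0] + M[d][1] * p[1] + M[d][2] * p[2]);
            double cost = 0.0;
            for (int d = 0; d < 3; ++d)
                cost += std::log2(absSum[d]); // lower is better (fewer bits/pixel, up to constants)
            return cost;
        }

    Minimizing this cost over rotations is what produces the averaged transform quoted in the post above.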
  • Jyrki Alakuijala's Avatar
    10th July 2020, 22:51
    That's amazing. Congratulations! I didn't expect that one could improve it by more than 5% at the corpus level without going to 10x slower decoding. It will be good to learn about it. I brought quite a few techniques myself into other codecs that I originally invented for WebP lossless. Entropy clustering I use in brotli, brunsli, and in JPEG XL with great success -- it is a wonderful little mechanism that makes entropy codes so much more flexible. My Select predictor performed better (better compression, faster because of fewer branches) than Paeth in JPEG XL, too. WebP's color decorrelation system through the guidance image is used in JPEG XL. Some of my inventions for WebP that didn't make it to JPEG XL include the 2d code for LZ77 distances and the color cache, but that is mostly because JPEG XL is less pixel-oriented than WebP lossless.
    170 replies | 42359 view(s)
  • pacalovasjurijus's Avatar
    10th July 2020, 22:24
    I think that we need to move the information and write the prefix code.
    45 replies | 5007 view(s)
  • Stefan Atev's Avatar
    10th July 2020, 21:20
    I don't have any disagreement with the fact that the optimal solution will optimize for both perceptual and coding considerations. It just seemed from your comment that you think a sequential optimization will work - first optimize directions for coding, then optimize quantization for perception. I think the parameters are coupled if you evaluate them with a perceptual metric, so a sequential optimization strategy seems to be a bit on the greedy side. Perhaps I misunderstood your explanation in the forum; I am reacting to the comments and have not read the paper carefully. I personally find the use of L1-PCA very rewarding. In ML, for the longest time L2/Gaussians have been used not because people think they're accurate but because they are convenient to analyze and compute with. Then people try to find exact solutions to crippled models instead of accepting sub-optimal solutions for a model that reflects reality more closely (that's the math bias toward wanting convergence proofs / theoretical guarantees).
    12 replies | 546 view(s)
  • Gotty's Avatar
    10th July 2020, 20:09
    A couple of introductory videos:
    https://www.youtube.com/watch?v=JsTptu56GM8 How Computers Compress Text: Huffman Coding and Huffman Trees
    https://www.youtube.com/watch?v=TxkA5UX4kis Compression codes | Journey into information theory | Computer Science | Khan Academy
    https://www.youtube.com/watch?v=Lto-ajuqW3w Compression - Computerphile
    3 replies | 223 view(s)
  • Jarek's Avatar
    10th July 2020, 14:02
    http://mattmahoney.net/dc/dce.html
    3 replies | 223 view(s)
  • randomthing's Avatar
    10th July 2020, 13:56
    Hello, please help me with some details about how to start practicing data compression. I mean, what stuff do I need to learn data compression? What type of math, theory, books, articles, etc. do I need to get started? And the most important thing: what type of math do I need to know before starting data compression? Thank you, have a good day.
    3 replies | 223 view(s)
  • Dresdenboy's Avatar
    10th July 2020, 13:02
    @Stefan: 1. This sounds interesting, but might not work. I put len-1 matches into my list because several good LZ77-family compressors use them (with 4b offsets) to avoid the longer encoding of literals when there are no longer matches. You might also consider literal runs (early in a file) vs. match runs (1-n matches following). You might also check out the literal escaping mechanism of pucrunch. 2. A Fibonacci sum calculation is a few instructions. But it quickly adds up. :) Gamma, in contrast, is cheap in terms of code footprint. Bit-oriented encoding is also cheap in asm (the widely seen get_bit subroutines with a shift register, which is also being filled with bytes, or BT instructions).
    41 replies | 2832 view(s)
  • Sportman's Avatar
    10th July 2020, 12:48
    Yes, Kwe uses a separator with unused bytes to navigate.
    2 replies | 190 view(s)
  • xezz's Avatar
    10th July 2020, 12:13
    Must the separator exist in the file?
    2 replies | 190 view(s)
  • Jarek's Avatar
    10th July 2020, 10:40
    For data compression applications, aligning them with a rotation gives a few percent size reduction - thanks to better agreement with 3 independent Laplace distributions (there is a small dependence which can be included in width prediction to get an additional ~0.3 bits/pixel reduction). Agreement with the assumed distribution is crucial for both lossy and lossless: log-likelihood, e.g. from ML estimation, is literally the savings in bits/pixel, and for disagreement we pay Kullback-Leibler bits/pixel.
    My point is that we should use both simultaneously: optimize according to perceptual criteria, and also this "dataset axis alignment" for agreement with the assumed distribution ... while it seems the currently used transforms are optimized only perceptually (?)
    To optimize for both simultaneously, the basic approach is just to choose three different quantization coefficients for the three axes, which is nearly optimal from the bits/pixel perspective (as explained in the previous post). But maybe it would be worth also rotating such axes for perceptual optimization? That would require formalizing such an evaluation ...
    Another question is the orthogonality of such a transform - should the three axes be orthogonal? While it seems natural from the ML perspective (e.g. PCA), it is not true for YCrCb nor YCoCg. But again, to optimize for non-orthogonal axes we need some formalization of perceptual evaluation ... To formalize such evaluation, we could use a distortion metric with weights in perceptually chosen coordinates ...
    12 replies | 546 view(s)
  • Stefan Atev's Avatar
    10th July 2020, 00:12
    I am not sure that's the case - it can happen that the directions PCA picked are not well aligned with the "perceptual importance" directions, so to maintain perceptual quality you need good precision in all 3 quantized values. As an example, if the directions have the same weight for green, you may be forced to spend more bits on all three of them; or if the directions end up equal angles apart from luma - same situation. I think for lossless it doesn't matter because your loss according to any metric will be zero - the difficulty is in having an integer transform that's invertible.
    12 replies | 546 view(s)
  • Sportman's Avatar
    9th July 2020, 23:55
    Kwe version 0.0.0.1 - keyword encoder
    Kwe encodes keywords:
    kwe e input output
    kwe d input output
    There are 4 optional options (must be used all at once):
    - keyword separator (default 32 = space)
    - minimal keyword length (default 2 = min.)
    - maximal keyword length (default 255 = max.)
    - keyword depth (default 255 = max.)
    Command line version, works with .NET Framework 4.8. Very simple first version; there is no error handling and it is not well tested. Input can be any text file. Output can be compressed with another archiver.
    2 replies | 190 view(s)