
Thread: (Extremely) tiny decompressors

  1. #1
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts

    (Extremely) tiny decompressors

    What are the smallest decompressors that you know about? I am talking about smallest in size (i.e. the smallest number of bytes their machine code occupies). To the best of my knowledge, the best compression ratios for extremely small decompressor sizes tend to be achieved by very primitive LZ77/LZSS-style compressors, most likely byte-aligned. The shortest non-byte-aligned decompressors that I know of are 30-40 bytes long. The shortest byte-aligned decompressors that I know of are 20-30 bytes long (insane, right?). Two examples of such extremely compact decompressors are given by Charles Bloom here: http://cbloomrants.blogspot.com/2011...z-decoder.html

    In his post, Charles Bloom also gives an example of a Predictor-style decompressor, which is also under 20 bytes long. I cannot claim my experiments with Predictor-style compression have ever been impressive, but I am open to any ideas outside of LZ77/LZSS-style dictionary compression and am open to considering other compression algorithms (as long as decompression is simple enough). So,
    1) Please share if you know any extremely compact decompressors (under 40 bytes, basically) - the platform does not really matter, but I would be surprised to see examples outside of x86, m68k or older 8-bit CPUs.
    2) Do you know any algorithms whose decompressor would fit into 40 bytes of assembly code and would still be able to compete with LZ77/LZSS-style compression?
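
    For reference, here is a minimal C model of the kind of byte-aligned LZ such tiny decoders implement (an illustrative token layout, not Bloom's exact format; no bounds checks, illustration only):
    Code:
    #include <stdint.h>
    #include <stddef.h>
    
    /* Illustrative byte-aligned LZ: each token byte packs a 4-bit literal count
       and a 4-bit match length; the literals follow the token, then a 2-byte
       little-endian match offset. Not any specific coder's format. */
    size_t tinylz_depack(const uint8_t *src, uint8_t *dst, size_t dst_size){
        uint8_t *out = dst, *end = dst + dst_size;
        while (out < end) {
            unsigned token = *src++;
            for (unsigned n = token >> 4; n; n--) *out++ = *src++;  /* literals */
            if (out >= end) break;               /* final token: literals only */
            unsigned off = src[0] | (src[1] << 8); src += 2;
            const uint8_t *from = out - off;
            for (unsigned n = (token & 15) + 2; n; n--) *out++ = *from++;  /* match, may overlap */
        }
        return (size_t)(out - dst);
    }
    The x86 or Z80 version of such a loop is what gets down to the 20-30 byte range mentioned above.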

  2. Thanks:

    Dresdenboy (6th June 2020)

  3. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,918
    Thanks
    291
    Thanked 1,274 Times in 720 Posts
    I think unBWT would fit in 40 bytes... not sure about actual compression though.
    Also a binary rangecoder might fit, so maybe we could make an order-2 CM.
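
    As a sketch of that idea (illustrative C only, sizes and adaptation rate untuned; a matching encoder is assumed): an LZMA-style binary range decoder driving an order-2 bit-tree model.
    Code:
    #include <stdint.h>
    #include <stdlib.h>
    #include <stddef.h>
    
    typedef struct { uint32_t range, code; const uint8_t *in; } RC;
    
    static void rc_init(RC *rc, const uint8_t *in){
        rc->in = in; rc->range = 0xFFFFFFFF; rc->code = 0;
        for (int i = 0; i < 5; i++)            /* first byte is a pad, as in LZMA */
            rc->code = (rc->code << 8) | *rc->in++;
    }
    
    static int rc_bit(RC *rc, uint16_t *p){    /* *p = P(bit=1) in 1/4096 units */
        uint32_t bound = (rc->range >> 12) * *p;
        int bit = rc->code < bound;
        if (bit) { rc->range  = bound; *p += (4096 - *p) >> 5; }
        else     { rc->range -= bound; rc->code -= bound; *p -= *p >> 5; }
        while (rc->range < (1u << 24)) {       /* renormalize */
            rc->range <<= 8; rc->code = (rc->code << 8) | *rc->in++;
        }
        return bit;
    }
    
    void order2_decode(const uint8_t *in, uint8_t *out, size_t n){
        uint16_t (*p)[256] = malloc(65536 * sizeof *p);   /* 32 MB of counters */
        if (!p) return;
        for (size_t i = 0; i < 65536u * 256; i++) ((uint16_t*)p)[i] = 2048;
        RC rc; rc_init(&rc, in);
        unsigned ctx = 0;                      /* context = previous two bytes */
        for (size_t i = 0; i < n; i++) {
            unsigned node = 1;                 /* bit-tree decode of one byte */
            while (node < 256) node = (node << 1) | rc_bit(&rc, &p[ctx][node]);
            out[i] = (uint8_t)node;
            ctx = ((ctx << 8) | out[i]) & 0xFFFF;
        }
        free(p);
    }
    The decode loop itself is tiny; the 40-byte question is whether the renormalization and the model update still fit once written in assembly.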

  4. Thanks:

    introspec (4th May 2020)

  5. #3
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    The shortest one I ever managed to produce back in the DOS era was 21 bytes: LZSS with 8 prefix bits packed into a byte and offset/match packed into 2 bytes (for a COM executable packer, so you could extract a few extra zeros at the end). A REP-prefixed bytewise copy is probably slow, but it is a tiny instruction.

  6. Thanks:

    introspec (4th May 2020)

  7. #4
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Stefan Atev View Post
    The shortest one I ever managed to produce back in the DOS era was 21 bytes: LZSS with 8 prefix bits packed into a byte and offset/match packed into 2 bytes (for a COM executable packer, so you could extract a few extra zeros at the end). A REP-prefixed bytewise copy is probably slow, but it is a tiny instruction.
    Have you published it or used to pack anything that has been published? I am basically looking at various decompressors of this kind and trying to see if there are good format variations that can be built upon further.

  8. #5
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    Unfortunately the code was lost long ago to hard drive failures; I was into the minuscule tiny demo "scene" in Bulgaria in the late 90s. If someone happens to have backups from the North Pole BBS run by Evil Yeti, that would be about it; I think I had written a demo for the BBS and he had the code... I am certain it was written in Turbo Assembler; Borland products ruled the day then. But I am confident something like this can be recovered; the principles are the same now as then:
    1. Pack 8 prefixes into a byte so you can unpack with a simple shift (and remember that condition flags are your friend)
    2. Be religious about the CX register: you want it for loop counts, match lengths, etc.; definitely use REP MOVSB etc. for the actual copy on match
    3. Know the size to unpack ahead of time, and stop decompression once your destination pointer crosses over a certain value;

    The compressor was really dumbly written at the time (I didn't understand suffix trees then), but functional. I think it used a 12:4 split for offset/length, and never used a length less than 3, so the real match lengths were 3-18. I think the compressor actually required that compressed + decompressed data fit in a 64K segment, to make jumping into the decompressed code easier; that was also why it was really only used for COMs - they are flat, and you have no loading to worry about like with EXEs.
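
    A plausible C reconstruction of that format (the original is lost, so flag polarity and field order are guesses):
    Code:
    #include <stdint.h>
    
    /* Guessed layout: a control byte holds 8 flags consumed MSB-first;
       flag=1 -> literal byte, flag=0 -> 2-byte match word with a 12-bit
       backwards offset and a 4-bit length field (real lengths 3..18). */
    void lzss_depack(const uint8_t *src, uint8_t *dst, const uint8_t *dst_end){
        unsigned flags = 0, count = 0;
        while (dst < dst_end) {                   /* principle 3: stop on crossing */
            if (count == 0) { flags = *src++; count = 8; }
            if (flags & 0x80) {
                *dst++ = *src++;                  /* literal */
            } else {
                unsigned w   = src[0] | (src[1] << 8); src += 2;
                unsigned len = (w & 15) + 3;      /* 3..18 */
                const uint8_t *from = dst - (w >> 4);
                while (len--) *dst++ = *from++;   /* the REP MOVSB part */
            }
            flags <<= 1; count--;
        }
    }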

  9. Thanks:

    introspec (4th May 2020)

  10. #6
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,918
    Thanks
    291
    Thanked 1,274 Times in 720 Posts

  11. Thanks:

    Stefan Atev (4th May 2020)

  12. #7
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    I hope you actually read it?

    I know Crinkler pretty well. However, its decompressor is 210-220 bytes long and trust me, it is not getting any slimmer soon.

  13. #8
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by introspec View Post
    What are the smallest decompressors that you know about? I am talking about smallest in size (i.e. the smallest number of bytes their machine code occupies). To the best of my knowledge, the best compression ratios for extremely small decompressor sizes tend to be achieved by very primitive LZ77/LZSS-style compressors, most likely byte-aligned. The shortest non-byte-aligned decompressors that I know of are 30-40 bytes long. The shortest byte-aligned decompressors that I know of are 20-30 bytes long (insane, right?). Two examples of such extremely compact decompressors are given by Charles Bloom here: http://cbloomrants.blogspot.com/2011...z-decoder.html

    In his post, Charles Bloom also gives an example of a Predictor-style decompressor, which is also under 20 bytes long. I cannot claim my experiments with Predictor-style compression have ever been impressive, but I am open to any ideas outside of LZ77/LZSS-style dictionary compression and am open to considering other compression algorithms (as long as decompression is simple enough). So,
    1) Please share if you know any extremely compact decompressors (under 40 bytes, basically) - the platform does not really matter, but I would be surprised to see examples outside of x86, m68k or older 8-bit CPUs.
    2) Do you know any algorithms whose decompressor would fit into 40 bytes of assembly code and would still be able to compete with LZ77/LZSS-style compression?
    Thanks for opening this thread. I'm working on my own tiny decompression experiments. And for starters let me point you to Baudsurfer's (Olivier Poudade) assembly art section on his Assembly Language Page: http://olivier.poudade.free.fr/ (site seems a bit buggy sometimes), which has several tiny compressors and decompressors for different platforms.

  14. #9
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Dresdenboy View Post
    And for starters let me point you to Baudsurfer's (Olivier Poudade) assembly art section on his Assembly Language Page: http://olivier.poudade.free.fr/ (site seems a bit buggy sometimes), which has several tiny compressors and decompressors for different platforms.
    Yes, thank you. I should have mentioned that when I gave my estimated tiny compressor sizes, I had a quick look in several places and definitely used Baudsurfer's page for reference. Unfortunately, his collection of routines is not very systematic (in the sense that I know better examples for at least some CPUs, e.g. Z80), so I am hoping that a bit more representative collection of examples can be gradually accumulated here.

  15. #10
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by introspec View Post
    Yes, thank you. I should have mentioned that when I gave my estimated tiny compressor sizes, I had a quick look in several places and definitely used Baudsurfer's page for reference. Unfortunately, his collection of routines is not very systematic (in the sense that I know better examples for at least some CPUs, e.g. Z80), so I am hoping that a bit more representative collection of examples can be gradually accumulated here.
    No problem. Accumulating this information sounds useful. I also collected a lot of information and did some analysis both of encoding ideas for small executables and existing compressors. Recently I stumbled over ZX7 and related to it: Saukav. The latter one is cool as it creates a decompressor based on the actual compression variant and parameters. Before finding it I already deemed this a necessity to keep sizes small. Especially for tiny intros, where a coder could omit some of the generic compressor code (code mover, decompression to original address) to save even more bytes (aside from coding compression-friendly).

    Here is an example with some of the tested compression algorithms (sizes w/o decompressor stubs and other data blocks, e.g. in the PAQ archive file format), leaving out all samples with less than 5% reduction, as they might be compressed already:
    [Figure: Figure_1_compression_ratios.png - compression ratios of the tested algorithms]

    Also interesting would be the total size incl. decompressor (not done yet). In this case we might just see different starting offsets (decompressor stub, tables etc.) on the Y axis and different gradients with increasing X.

  16. Thanks:

    introspec (2nd June 2020)

  17. #11
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by introspec View Post
    I hope you actually read it?

    I know Crinkler pretty well. However, its decompressor is 210-220 bytes long and trust me, it is not getting any slimmer soon.
    As you're mentioning Crinkler: I had some interesting discussion with Ferris, who created Squishy. He uses a small decompressor to decompress the actual decompressor. He said the smaller one is about 200B. Then there is xlink; unlord planned to give a talk about it at Revision Online 2020, but it was cancelled. There seems to be some progress which he hasn't published yet, though. This might also be interesting to watch.

    BTW my smallest decompression loop (for small data sizes and only a match length of 2) is 12B. Making it more generic for offsets "blows" it up to 17B. Typical LZ with multiple lengths starts at ~20B depending on variant and assumptions. There are likely similarities to Stefan Atev's lost one (a sketch of such a format follows below). But I have to continue testing all those variants (with their respective encoders) before publishing more about them.

    Another idea was to encode some x86 specific prefix. Such a prefix emitter can be as small as 15B.
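
    Purely as an illustration of why a fixed match length of 2 keeps the loop so small, here is one hypothetical byte-aligned shape of such a format (not the actual 12B routine):
    Code:
    #include <stdint.h>
    #include <stddef.h>
    
    /* Hypothetical: one control bit per token (LSB-first, with a sentinel bit
       to trigger reloads); bit=1 -> length-2 match with a 1-byte offset,
       bit=0 -> literal byte. */
    size_t lz2_depack(const uint8_t *src, uint8_t *dst, size_t dst_size){
        uint8_t *out = dst;
        unsigned flags = 1;                       /* forces a reload on first use */
        while ((size_t)(out - dst) < dst_size) {
            if (flags == 1) flags = 0x100 | *src++;
            if (flags & 1) {                      /* match: copy exactly 2 bytes */
                const uint8_t *from = out - (unsigned)(*src++ + 1);  /* offset 1..256 */
                *out++ = from[0]; *out++ = from[1];
            } else {
                *out++ = *src++;                  /* literal */
            }
            flags >>= 1;
        }
        return (size_t)(out - dst);
    }
    With only two output paths and no length decoding, the assembly version plausibly collapses to a dozen bytes.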

  18. Thanks:

    introspec (2nd June 2020)

  19. #12
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    841
    Thanks
    241
    Thanked 308 Times in 184 Posts
    Smallest decoder for general purpose decoding just starts executing the compressed signal.

  20. #13
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    Quote Originally Posted by Dresdenboy View Post
    Especially for tiny intros, where a coder could omit some of the generic compressor code (code mover, decompression to original address) to save even more bytes (aside from coding compression-friendly).
    That was my experience for sure; I think I had to just make sure decompression started at a 16B-aligned offset, so you could later just bump a segment register to point to the decompressed code when you jump to it.

  21. #14
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by Stefan Atev View Post
    That was my experience for sure; I think I had to just make sure decompression started at a 16B-aligned offset, so you could later just bump a segment register to point to the decompressed code when you jump to it.
    Interesting idea. So is it pop cs as a jump instruction? With a COM executable and initial registers set to DS/CS (and useful constants like -2 or 100h in others) this sounds good.

  22. #15
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by introspec View Post
    1) Please share if you know any extremely compact decompressors - under 40 bytes basically, - the platform does not really matter, but I would be surprised to see examples outsize of x86, m68k or older 8-bit CPUs.
    2) Do you know any algorithms whose decompressor would fit into 40 bytes of assembly code and would still be able to compete with LZ77/LZSS style compression?
    Adding to 1): the 6502's ISA is likely too limited to do something useful in <40B. BTW is there a comprehensive list of decompressor sizes and algos somewhere? You mentioned several ones in your Russian article.
    Adding to 2): chances of being able to compete diminish quickly with less code. Compressed data size + decompressor might be an interesting metric. But I got an idea to do some LZ78 variant; maybe it stays below 40B (see the sketch below).
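
    Since LZ78 needs no window search at decode time - every phrase is an earlier phrase plus one byte - the decoder can stay very small. A minimal C sketch (assuming classic (index, literal) pairs, 2-byte index + 1 literal per token; not any particular existing coder):
    Code:
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>
    
    size_t lz78_depack(const uint8_t *src, size_t ntokens, uint8_t *dst){
        /* phrase table: (start, length) into the already-decoded output;
           phrase 0 is the empty phrase */
        static size_t start[4096], len[4096];
        size_t nphr = 1;
        uint8_t *out = dst;
        for (size_t t = 0; t < ntokens; t++) {
            unsigned idx = src[0] | (src[1] << 8);   /* parent phrase index */
            uint8_t  c   = src[2]; src += 3;
            size_t s = start[idx], l = len[idx];
            if (nphr < 4096) {                       /* record the new phrase */
                start[nphr] = (size_t)(out - dst);
                len[nphr]   = l + 1;
                nphr++;
            }
            memcpy(out, dst + s, l); out += l;       /* copy parent phrase */
            *out++ = c;                              /* append the new literal */
        }
        return (size_t)(out - dst);
    }
    Whether the phrase-table bookkeeping still fits in 40B of assembly is exactly the open question.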

  23. #16
    Member
    Join Date
    Jan 2020
    Location
    Fl
    Posts
    33
    Thanks
    0
    Thanked 2 Times in 2 Posts
    A file like
    1234567890abcdefghij
    is
    Size = 20 bytes (20 bytes)
    size on disk = 4.00 KB (4,096 bytes)

    But if you make the file name 1234567890abcdefghij.txt and erase the file content, then you have 0 bytes, as cheesy as that sounds.
    I assume it would still take the same space on your hard drive even if the file is 0 bytes.
    Even if the file is 0 bytes, the compressed file still comes to 82 bytes; with the 20 bytes of content it is 138, or 136 if it is 1 letter 20 times. Sure, it has some pointless info in the compressed file, just like how many pictures also carry pointless info.
    Example
    "7z¼¯' óò÷6 2 ÷|\N€€  z . t x t  ªÚ38Ö "
    Pointless to even have "z . t x t " with so many spaces.

    On a side note, we live in a society which ignores wastefulness, so in everything you see there is plenty of waste. Kind of like how, if everyone did not throw away at least 2 grains of rice a day, that would be enough to feed every starving, malnourished person (malnutrition being the main cause of health issues in the world), yet most throw away 1000 times more than that, even good-quality food, since many confuse the best-by date with an expiration date, although there are no expiration dates on food.

    File compression programs don't let you set the library that small.
    The point is, why should it be done?
    It can be, depending on the file. I assume the bigger the compression program is (to store various methods), the smaller the file.
    It would take more hard drive space to have 1000 20-byte files than 1 big file.
    The transfer rate would also suffer greatly.

    Sure, it is possible to have an algorithm for something small, but again, no one would bother.
    If you intend to use that method on bigger files, I am guessing it might be limited. But it might be a fun challenge for someone to try.

  24. #17
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Trench, I understand these more pragmatic and/or philosophical considerations. They could even lead to erasing the benefit of compressors like PAQ due to the computational costs.

    But here we're discussing decompressors which ought to be usable on older and current platforms, with constraints on the actual code's size (like a program that fits into a 512 byte bootsector, maybe with less than that available for custom code). There are competitions for that.

  25. #18
    Member
    Join Date
    Jan 2020
    Location
    Fl
    Posts
    33
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Sorry, yes, true.
    But something like that won't be as versatile as the bigger ones. Kind of like having a Swiss army knife as a daily kitchen knife. Unless most of the code is in the file, which won't work, or it compresses itself. But it depends on what one wants to decompress, since one size can't fit all, I assume.

    Zip programs, like all programs, are satellite programs relying on the main computer OS they run on. So the file is technically bigger. If you were to put it on another or older OS, they won't function or be as small. So if the OS has files that help the zip program, then you can get something smaller. So in a way it's kind of cheating. What would a math professor come up with?

    But again, it's good to try for fun.
    Just an opinion.

  26. #19
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Dresdenboy View Post
    No problem. Accumulating this information sounds useful. I also collected a lot of information and did some analysis both of encoding ideas for small executables and existing compressors. Recently I stumbled over ZX7 and related to it: Saukav. The latter one is cool as it creates a decompressor based on the actual compression variant and parameters. Before finding it I already deemed this a necessity to keep sizes small. Especially for tiny intros, where a coder could omit some of the generic compressor code (code mover, decompression to original address) to save even more bytes (aside from coding compression-friendly).

    Here is an example with some of the tested compression algorithms (sizes w/o decompressor stubs and other data blocks, e.g. in the PAQ archive file format), leaving out all samples with less than 5% reduction, as they might be compressed already:
    [Figure: Figure_1_compression_ratios.png - compression ratios of the tested algorithms]

    Also interesting would be the total size incl. decompressor (not done yet). In this case we might just see different starting offsets (decompressor stub, tables etc.) on the Y axis and different gradients with increasing X.
    Yes, I know about Saukav too. I did not have time to do detailed testing of it, but I understand what it does quite well and it should offer compression at the level of Pletter (likely somewhat better than Pletter), while being fairly compact. I believe that its approach to adaptivity, with multiple specific decompressors offered, is an excellent way to increase the compression "for free".

    However, I strongly suspect that a better solution must be available, most likely for 1K intros and definitely for 256b intros. I can explain what I mean as follows:

    Suppose you are working on a 1K intro that uses Saukav and at some point you reach the situation where the compressed intro together with decompressor uses up all available space. Suppose that the average decompressor length is 64 bytes (this is the size of zx7b decompressor - the origin of Saukav). Then your compressed size is 1024-64=960 bytes. I do not know the exact ratio offered by Saukav, so I'll use Pletter's ratio of 1.975 as a guide. Hence, our intro is actually 960*1.975=1896 bytes long. Let us now consider switching to a better compressor, e.g. Shrinkler, which is LZMA-based and compresses at the level similar to 7-zip. Its ratio on the same small file corpus that I use for many of my tests is about 2.25. Thus, compressed by Shrinkler our intro will become 1896/2.25~843 bytes long (I should be saying "on average", but it is very annoying to repeat "on average" all the time, so I assume it implicitly). We saved 960-843=117 bytes, which may sound great, yet in fact is useless. The shortest decompressor for Shrinkler on Z80 is 209 bytes long, so we saved 117 bytes in data, and added 209-64=145 bytes in decompressor, i.e. lost 28 bytes overall. The point is, when you are making a 1K intro, Shrinkler will lose to Saukav (to ZX7, MegaLZ, to basically any decent compressor with compact enough decompressor).

    When working with 4K of memory, these concerns become pretty much irrelevant, you can use Crinkler, Shrinkler or any other tool of your choice. But at 1K the ratio becomes less of a concern, as long as it is not completely dreadful, and decompressor size starts to dominate the proceedings. For 256b intros the situation is even more dramatic. I made one such intro using ZXmini (decompressor size 38 bytes) and found it (the decompressor) a bit too large. More can be done for sure.

    So, looking at your graph, I do not know what your target size is, but for anything <=1K, trust me: without decompressor size included, this data is meaningless.
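
    The trade-off reduces to a one-line break-even check; the numbers from the example above, as a C sketch:
    Code:
    #include <stdio.h>
    
    int main(void){
        const double budget = 1024.0;               /* bytes available */
        const double stub_a =  64, ratio_a = 1.975; /* zx7b-style stub, Pletter-like ratio */
        const double stub_b = 209, ratio_b = 2.25;  /* Shrinkler stub on Z80, its ratio */
        double intro    = (budget - stub_a) * ratio_a;  /* uncompressed size: 1896 */
        double packed_b = intro / ratio_b;              /* ~843 */
        printf("A: %.0f bytes  B: %.0f bytes\n",
               budget, packed_b + stub_b);              /* 1024 vs ~1052: B loses 28 bytes */
        return 0;
    }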

  27. #20
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    My experience being with x86 1K intros, this certainly resonates; at the end of the day, the tiny (de)compressor should only be helping you with code - all the data in the intro should be custom-packed anyway, in a way that makes it difficult to compress with LZ-based algorithms. For example, I remember using basically 2 bits per beat for an audio track (2 instruments only, both OPL-2 synth); fonts would be packed, etc.

    4K is different, I think there you just have a lot more room. And for 128B and 256B demos, compression is very unlikely to help, I think.

  28. #21
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,918
    Thanks
    291
    Thanked 1,274 Times in 720 Posts
    Windows has some preinstalled compression algorithms actually (deflate, LZX/LZMS): http://hugi.scene.org/online/hugi28/...20dropping.htm
    I wonder if the same applies to other platforms? Maybe there is at least some relevant code in the ROM?
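
    One such trick on Windows (an assumption about what the linked article covers, but this particular API does exist) is ntdll's RtlDecompressBuffer, which gives LZNT1 decompression with no decompressor code of your own:
    Code:
    #include <windows.h>
    
    typedef LONG (WINAPI *RtlDecompressBuffer_t)(
        USHORT format, PUCHAR dst, ULONG dstSize,
        PUCHAR src, ULONG srcSize, PULONG finalSize);
    
    #define COMPRESSION_FORMAT_LZNT1 2
    
    /* decompress LZNT1 data using the routine that ships in ntdll.dll */
    ULONG unpack_lznt1(PUCHAR src, ULONG srcSize, PUCHAR dst, ULONG dstSize){
        RtlDecompressBuffer_t fn = (RtlDecompressBuffer_t)
            GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlDecompressBuffer");
        ULONG final = 0;
        if (fn) fn(COMPRESSION_FORMAT_LZNT1, dst, dstSize, src, srcSize, &final);
        return final;
    }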

  29. #22
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Dresdenboy View Post
    As you're mentioning Crinkler: I had some interesting discussion with Ferris, who created Squishy. He uses a small decompressor to decompress the actual decompressor. He said the smaller one is about 200B. Then there is xlink; unlord planned to give a talk about it at Revision Online 2020, but it was cancelled. There seems to be some progress which he hasn't published yet, though. This might also be interesting to watch.

    BTW my smallest decompression loop (for small data sizes and only a match length of 2) is 12B. Making it more generic for offsets "blows" it up to 17B. Typical LZ with multiple lengths starts at ~20B depending on variant and assumptions. There are likely similarities to Stefan Atev's lost one. But I have to continue testing all those variants (with their respective encoders) before publishing more about them.

    Another idea was to encode some x86 specific prefix. Such a prefix emitter can be as small as 15B.
    Frankly, I do not think Crinkler (or similar tools) are very relevant to this thread. You are right that there could be improvements to the decompressor, but I was trying to say that you won't get LZMA into sub 50-100b decompressor, so although an amazing tool for 4K or 8K intros, it is just a different kind of tool.

    Your idea to only have a match length of 2 is cool (although I need to try it in practice to see how much ratio one loses in this case). The smallest generic LZ77 on Z80 that I know has an 18 byte decompression loop, so your 17 byte loop would be interesting to see - have you published it anywhere? In any case, I am working on a small article about such mini-decompressors and am definitely looking forward to anything you will write.

    I mainly code on Z80, so I do not know much about prefix emitters. Can you point to any discussion of what they can look like?

  30. #23
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Dresdenboy View Post
    Adding to 1): the 6502's ISA is likely too limited to do something useful in <40B. BTW is there a comprehensive list of decompressor sizes and algos somewhere? You mentioned several ones in your Russian article.
    Adding to 2): chances of being able to compete diminish quickly with less code. Compressed data size + decompressor might be an interesting metric. But I got an idea to do some LZ78 variant; maybe it stays below 40B.
    1) I know some neat examples of Z80 decompressors. I am not aware of any systematic lists. I recently did some reverse-engineering of ZX Spectrum based 1Ks. About one third of them were packed; the most popular compressors seemed to be ZX7, MegaLZ and BitBuster (in order of decreasing popularity; note that the respective decompressor sizes are 69, 110 and 88 bytes).
    2) Maybe yes, but the large influence of the decompressor size means that data format becomes a lot more important than usual. I think that this implies a lot of scope for adaptivity and tricks.

  31. #24
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Stefan Atev View Post
    My experience being with x86 1K intros, this certainly resonates; at the end of the day, the tiny (de)compressor should only be helping you with code - all the data in the intro should be custom-packed anyway, in a way that makes it difficult to compress with LZ-based algorithms. For example, I remember using basically 2 bits per beat for an audio track (2 instruments only, both OPL-2 synth); fonts would be packed, etc.

    4K is different, I think there you just have a lot more room. And for 128B and 256B demos, compression is very unlikely to help, I think.
    I think that there are two approaches to making a compressed intro. The first, more common one, is to compress your well-tuned code so that a bit of extra squeeze can be achieved. This is a very traditional strategy, but it is not the only one. The second strategy is to design your data structures and also your code to help the compressor. E.g. often in a compressed intro a short loop can be replaced by a series of unrolled statements - an insane strategy in a size-optimized world, but quite possibly a viable approach if you know that the intro will be compressed. A complete paradigm shift is needed in this case, of course.

  32. #25
    Member
    Join Date
    Apr 2017
    Location
    United Kingdom
    Posts
    82
    Thanks
    62
    Thanked 33 Times in 21 Posts
    Quote Originally Posted by Shelwien View Post
    Windows has some preinstalled compression algorithms actually (deflate, LZX/LZMS): http://hugi.scene.org/online/hugi28/...20dropping.htm
    I wonder if the same applies to other platforms? Maybe there is at least some relevant code in the ROM?
    I think some people made use of tricks like this. I have a lot of experience with older computers, for them data compression pretty much did not exist. I'd love to be proved wrong here, but I'd be very surprised if any of 1980s machines has anything of the kind in their ROMs.

  33. #26
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    Quote Originally Posted by introspec View Post
    E.g. often in a compressed intro a short loop can be replaced by a series of unrolled statements - an insane strategy in a size-optimized world, but quite possibly a viable approach if you know that the intro will be compressed. A complete paradigm shift is needed in this case, of course.
    I can see that, though it goes against my instincts. I have seen people extract common instruction sequences into subroutines even if they were pretty arbitrary and logically unrelated; you eat 3 bytes to do a call (and one ret) each time you need the sequence, and that is basically an "executable LZ". I can see how actual LZ would quickly be better, since matches are encoded more efficiently than even near calls. However, for some data LZ is not that great, while a custom encoding could work quite well.

    None of the stuff I ever wrote assumed anything more than DOS, wouldn't have been able to depend on compression routines in Windows.

  34. #27
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,918
    Thanks
    291
    Thanked 1,274 Times in 720 Posts
    > wouldn't have been able to depend on compression routines in Windows.

    I actually mean chunks of existing code, like
    https://en.wikipedia.org/wiki/Return...amming#Attacks
    https://github.com/JonathanSalwan/ROPgadget#screenshots

  35. #28
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    54
    Thanks
    13
    Thanked 20 Times in 15 Posts
    Quote Originally Posted by Shelwien View Post
    > wouldn't have been able to depend on compression routines in Windows.

    I actually mean chunks of existing code, like
    https://en.wikipedia.org/wiki/Return...amming#Attacks
    https://github.com/JonathanSalwan/ROPgadget#screenshots
    I guess ROM is the only thing you can kind of depend on to be your "dictionary"; even then, you'd be making assumptions about the machine you're running on that may be a bit too specific (e.g., all the 6502 clones I ever programmed were Pravetz 82 clones, which definitely had their own ROM different from an Apple II); it's exactly like ROP exploits - they work on specific versions of OSes, not generically on a platform. Though maybe for non-malicious code, there's less risk of somebody patching what you depended on.

    People used to try to check the code pointed to by an interrupt handler to try to ensure that they are not running under debugger, or that certain TSRs are not installed, and that ultimately breaks when what you're checking legitimately changes...
    Last edited by Stefan Atev; 3rd June 2020 at 16:19. Reason: typos

  36. #29
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    Quote Originally Posted by introspec View Post
    Yes, I know about Saukav too. I did not have time to do detailed testing of it, but I understand what it does quite well and it should offer compression at the level of Pletter (likely somewhat better than Pletter), while being fairly compact. I believe that its approach to adaptivity, with multiple specific decompressors offered, is an excellent way to increase the compression "for free".

    However, I strongly suspect that a better solution must be available, most likely for 1K intros and definitely for 256b intros. I can explain what I mean as follows:

    Suppose you are working on a 1K intro that uses Saukav and at some point you reach the situation where the compressed intro together with decompressor uses up all available space. Suppose that the average decompressor length is 64 bytes (this is the size of zx7b decompressor - the origin of Saukav). Then your compressed size is 1024-64=960 bytes. I do not know the exact ratio offered by Saukav, so I'll use Pletter's ratio of 1.975 as a guide. Hence, our intro is actually 960*1.975=1896 bytes long. Let us now consider switching to a better compressor, e.g. Shrinkler, which is LZMA-based and compresses at the level similar to 7-zip. Its ratio on the same small file corpus that I use for many of my tests is about 2.25. Thus, compressed by Shrinkler our intro will become 1896/2.25~843 bytes long (I should be saying "on average", but it is very annoying to repeat "on average" all the time, so I assume it implicitly). We saved 960-843=117 bytes, which may sound great, yet in fact is useless. The shortest decompressor for Shrinkler on Z80 is 209 bytes long, so we saved 117 bytes in data, and added 209-64=145 bytes in decompressor, i.e. lost 28 bytes overall. The point is, when you are making a 1K intro, Shrinkler will lose to Saukav (to ZX7, MegaLZ, to basically any decent compressor with compact enough decompressor).

    When working with 4K of memory, these concerns become pretty much irrelevant, you can use Crinkler, Shrinkler or any other tool of your choice. But at 1K the ratio becomes less of a concern, as long as it is not completely dreadful, and decompressor size starts to dominate the proceedings. For 256b intros the situation is even more dramatic. I made one such intro using ZXmini (decompressor size 38 bytes) and found it (the decompressor) a bit too large. More can be done for sure.
    We could use "~" anywhere for "on average" (some thread compression).

    Saukav is probably just one example encoding format (zx7) of what is actually a global optimization problem (any compression method + decomp stub size on target platform + decomp stub adaptations + other adaptations). I'm doing a lot of analysis of x86 intros (which methods work best, could they be stripped down, etc.). For example (and you surely saw that when doing your own small decompressors), a more or less generic packer uses a lot of encodings, code paths etc. to be good overall. But for small files (where there is also only little research, e.g. about compressed message sizes of the eth protocol etc.) some specific subset with very little case handling might fit better. In small files (like 256B intros) there are a lot of 1B (short opcodes) and many 2B matches, but very few 3+ matches. So any complex handling of the latter, like interleaved gamma codes etc., might be just too much. Also the probabilities of literal run lengths look very different here. That's why I also think there is room for improvement.

    Quote Originally Posted by introspec View Post
    So, looking at your graph, I do not know what your target size is, but for anything <=1K, trust me: without decompressor size included, this data is meaningless.
    This was more about being platform agnostic: first find the "overhead" of algorithms for finding their models, tuning to the data, or simply encoding literals and matches. The next step would be adding the decomp stub sizes, as COM might mean 16B in "low" memory (on PC at least, with 40+B for accessing >1MB).

    Some current list of tested compressors with example compressed sizes:
    Code:
    lzw:     227/ 256 B
    puc:     221/ 256 B
    exo:     201/ 256 B
    zx7:     231/ 256 B
    hrust:   223/ 256 B
    pkf:     221/ 256 B
    pkft:    227/ 256 B
    muc:     251/ 256 B
    apc:     214/ 256 B
    apack:   217/ 256 B
    paq1:    189/ 256 B
    lzma:    292/ 256 B
    zlib:    246/ 256 B
    gzip:    258/ 256 B
    bz2:     288/ 256 B
    lz4:     261/ 256 B
    paq1ss:  188/ 256 B
    paq3n:   188/ 256 B
    paq6v2:  188/ 256 B
    paq8f:   185/ 256 B
    paq8h:   194/ 256 B
    paq8l:   204/ 256 B
    paqar:   179/ 256 B
    If you have some interesting ones, I'd be happy to add them. So far I also want to add oneKpaq, as it has a 128B decompressor.

    More on the other topics later. E.g. the prefix emitter: on x86 I might try to use long SSE instructions. They share a 0x0f prefix, which is easy to emit with little code. On Z80 there are several prefixes (FD, ED etc.), which are rather common. It would be easy to write such a compressor, encoding the prefixes in 1-2 bits. A rough sketch of the idea follows below.
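
    The following is a purely hypothetical format, not the 11B/15B routine described here: the packer strips the shared 0x0F escape byte from the opcodes and keeps one flag bit per stored byte; the depacker re-inserts the prefix where flagged. In C for clarity:
    Code:
    #include <stdint.h>
    #include <stddef.h>
    
    /* Hypothetical prefix-emitter pass: flags[] holds one bit per body byte;
       a set bit means "this byte was prefixed with 0x0F in the original code". */
    size_t emit_prefixes(const uint8_t *body, const uint8_t *flags,
                         size_t n, uint8_t *dst){
        uint8_t *out = dst;
        for (size_t i = 0; i < n; i++) {
            if (flags[i >> 3] & (1u << (i & 7)))
                *out++ = 0x0F;              /* re-emit the shared SSE escape */
            *out++ = body[i];
        }
        return (size_t)(out - dst);
    }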
    Last edited by Dresdenboy; 5th June 2020 at 08:05. Reason: wrong SIMD

  37. Thanks:

    introspec (10th June 2020)

  38. #30
    Member
    Join Date
    May 2020
    Location
    Berlin
    Posts
    26
    Thanks
    7
    Thanked 9 Times in 6 Posts
    I found my latest version of the prefix emitter. It's even just 11B in the main loop (x86, 16-bit) without proper setup. Does the Z80 have bitfield instructions? I guess not - I last programmed it 28 years ago. Then the typical shift, test, reload would be required.

    Addendum:
    Some further research brought up these LZ based solutions (68k, 28-42B):
    http://eab.abime.net/showthread.php?...72#post1148772 (ross)
    http://eab.abime.net/showthread.php?...32#post1152832 (paraj)
    http://eab.abime.net/showthread.php?...66#post1270966 (Blueberry)
    https://www.pouet.net/topic.php?whic...page=1#c276543 (baah, who also has a 32B variant in some productions)

    I will add lzsa, lz32b (ross' encoder), lz48, emcompress/smashv2 to my list soon. I will also start adding decomp sizes to the calculations when available.
    Last edited by Dresdenboy; 7th June 2020 at 03:12.

  39. Thanks:

    introspec (10th June 2020)
