
Thread: Fast LZ compression

  2. #2 encode:
    Currently I use the following LZ-output coding scheme (a decoder sketch follows below):

    <flags> - 1 byte, representing eight flags
    <eight LZ units> - 8 bytes, literals and/or match lengths
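    As an illustration, here is a minimal sketch of decoding one such group (my own code, not encode's). Since this is LZP, a match needs no offset - the copy position is predicted from the context - so each of the eight units fits in one byte. MINMATCH, the flag-bit convention, and the helpers are all assumptions here.

    #include <cstdio>

    const int MINMATCH = 3;            // assumed minimum match length

    void output_byte(int c);           // hypothetical: append a literal to the output
    void copy_match(int pos, int len); // hypothetical: copy len bytes from an earlier position
    int  lzp_predict();                // hypothetical: position predicted by the LZP context table

    void decode_group(FILE *in) {
        int flags = fgetc(in);         // <flags>: one byte, eight flags
        for (int k = 0; k < 8; k++) {
            int unit = fgetc(in);      // one LZ unit = one byte
            if (flags & (1 << k))      // set bit: the unit is a coded match length
                copy_match(lzp_predict(), unit + MINMATCH);
            else
                output_byte(unit);     // clear bit: the unit is a raw literal
        }
    }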

  3. #3 encode:
    Here I also use a fast version of Flexible Parsing (which can be disabled). Most likely THOR uses greedy parsing only.


  4. #4 encode:
    Check out some code:

    #define N (1 << 24) // we use a 16 MB buffer; actually we can use a buffer of any size
    #define X (1 << 16) // we can hold an order-2 context, or a 16-bit hash value

    // not shown in the original post - assumed here so the fragment is complete:
    #include <cstdio>  // fread
    #include <cstring> // memset

    #define MINMATCH 3                // assumed minimum match length
    #define MAXMATCH (255 + MINMATCH) // assumed: coded length fits in one byte

    static FILE *f;            // input file, opened elsewhere
    void output_flag(int bit); // output helpers, defined elsewhere
    void output_len(int len);
    void output_literal(int c);

    // like with all my programs:
    static unsigned char buf[N]; // buf[] - a global variable, holds a block of uncompressed data
    static int tab[X];           // tab[] - an index table, indexes the actual buffer position by context

    // the main and only function for match search. very simple, isn't it?
    inline int getmatch(int i, int n) {
        // the order-2 context: the two bytes preceding position i
        int x = *(reinterpret_cast<unsigned short *>(&buf[i - 2]));

        int p = tab[x]; // position predicted for this context

        int end = i + MAXMATCH;
        if (end > n)
            end = n;

        // compare the predicted position with the current one, byte by byte
        int j = i;
        while ((buf[p++] == buf[j]) && (++j < end))
            ;

        return (j - i); // 0 if even the first byte differs
    }

    // the main routine
    static void encode() {
        int n;
        while ((n = fread(buf, 1, N, f)) > 0) { // load the buffer
            memset(tab, 0, sizeof(tab)); // reset the table

            int i = 0;
            for (; (i < 2) && (i < n); i++) // seed some raw bytes to establish a context
                output_literal(buf[i]);

            while (i < n) {
                int maxlen = getmatch(i, n);

                if ((maxlen >= MINMATCH) && (i < (n - maxlen))) {
                    int len = maxlen;
                    int maxvalue = len + getmatch(i + len, n);

                    if (maxvalue < ((MAXMATCH * 2) - 1)) { // optimal parsing, in this case it is really optimal!!
                        for (int j = 1; j < len; j++) {
                            int temp = j + getmatch(i + j, n);

                            if (temp > maxvalue) {
                                maxvalue = temp;
                                maxlen = j; // optimize the match length
                            }
                        }
                    }
                }

                // update the table with the current position
                int x = *(reinterpret_cast<unsigned short *>(&buf[i - 2]));
                tab[x] = i;

                if (maxlen >= MINMATCH) {
                    output_flag(1);
                    output_len(maxlen - MINMATCH);
                    i += maxlen;
                }
                else {
                    output_flag(0);
                    output_literal(buf[i]);
                    i++;
                }
            }
        }
    }

  5. #5 Bulat Ziganshin:
    Quote Originally Posted by encode
    There are a few ways to encode LZ output:
    + Byte-aligned, using flag bytes. For example, 0xB5 can represent the match flag;
    + Byte-aligned, using bit flags. As with LZRW - the first byte represents eight flags, followed by eight units - literals/match lengths;
    + Bit I/O. Literals are encoded with a fixed number of bits; match lengths can be coded via variable-length codes;
    + Huffman coding.
    The first way is probably too expensive - with a fast LZP you will have a large amount of matches. The second is the LZSS one, known since ~1982, called "bytecode" in TOR. The 3rd is also supported in TOR; the 4th is a bit too complex, so I didn't implement it either. Instead, I added a very fast aricoder.

    I've already implemented lazy parsing, but it turned out to be too slow for the -1..-4 modes (i.e. the ones comparable with THOR). I can hardly imagine getting THOR-comparable speeds with flexible parsing.

    If you're interested, I'll explain how to use parts of TOR to construct your own compressor. In particular, as you can see from my program, it's enough to issue Encode() calls to encode matches/literals - everything else is done by the byte/bit/aricoders. Although I'd prefer to see an LGPLed one as a result; otherwise it will be nothing better than the closed-source THOR.
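    As an illustration of the third option quoted above - literals with a fixed number of bits, match lengths via variable-length codes - here is a sketch using a simple Elias-gamma-style length code. This is my own example, not TOR's actual bitcode format; putbits() and the flag-bit convention are assumed.

    const int MINMATCH = 3;               // assumed minimum match length

    void putbits(unsigned value, int n);  // assumed bit writer, MSB-first

    void put_length(unsigned len) {       // gamma-style code, len >= 1
        int bits = 0;
        while ((len >> bits) > 1) bits++; // index of the highest set bit
        putbits(0, bits);                 // unary prefix: 'bits' zero bits
        putbits(len, bits + 1);           // then the value itself
    }

    void put_literal(unsigned char c) {
        putbits(1, 1);                    // flag bit: literal
        putbits(c, 8);                    // fixed 8 bits
    }

    void put_match(unsigned len) {
        putbits(0, 1);                    // flag bit: match
        put_length(len - MINMATCH + 1);   // shift so the smallest length codes as 1
    }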

  6. #6 Bulat Ziganshin:
    >optimal parsing, in this case it is really optimal!!

    AFAIS it's exactly flexible parsing. In optimal parsing you walk from end to begin, after you've found matches for all positions in the block.
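    For contrast, here is a sketch of the backward walk Bulat describes (my illustration, not his code). True optimal parsing needs a pricing model, so fixed per-literal and per-match costs are assumed, and match_len[i] - the longest match available at each position, clipped to n - i - is assumed to be precomputed for the whole block.

    #include <vector>

    // price[i] = cheapest cost to encode buf[i..n); step[i] = bytes to take at i
    void optimal_parse(const std::vector<int> &match_len, int n,
                       int min_match, int literal_cost, int match_cost,
                       std::vector<int> &step) {
        std::vector<long long> price(n + 1, 0);
        step.assign(n + 1, 1);
        for (int i = n - 1; i >= 0; i--) {          // walk from end to begin
            price[i] = price[i + 1] + literal_cost; // option 1: emit a literal
            step[i] = 1;
            for (int len = min_match; len <= match_len[i]; len++) { // option 2: a match
                long long p = price[i + len] + match_cost;
                if (p < price[i]) { price[i] = p; step[i] = len; }
            }
        }
        // the parse is then read forward: i = 0; take step[i] bytes; i += step[i]; ...
    }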

  7. #7 encode:
    I posted just the basics. Authors like Bulat and Oscar can take some ideas from it. If anyone has questions - please ask. I've noticed that I've written too many compressors!

  8. #8 encode:
    Quote Originally Posted by Bulat Ziganshin
    AFAIS it's exactly flexible parsing. In optimal parsing you walk from end to begin, after you've found matches for all positions in the block.
    In the case of static encodings, Flexible Parsing achieves optimality - it's just a different way to do the same thing...

  9. #9 Bulat Ziganshin:
    Quote Originally Posted by encode
    Here I also use a fast version of Flexible Parsing
    Yes, it is flexible parsing, but why do you call it "fast"?

    Try your code: I guess it will be slower than thor ex or tor -4.

  10. #10 encode:
    Quote Originally Posted by Bulat Ziganshin
    Yes, it is flexible parsing, but why do you call it "fast"?
    With LZP it works fast. Anyway, you can disable it.

  11. #11 Bulat Ziganshin:
    Quote Originally Posted by encode
    In the case of static encodings, Flexible Parsing achieves optimality - it's just a different way to do the same thing...
    No. Actually, the original optimal parsing - by SS and in QUARK - was about minimizing the number of matches, not the total price. Flexible parsing here also tries to minimize the number of matches, but does it locally - that's the difference between the flexible and the optimal scheme.

  12. #12 encode:
    With stuff like LZP it is optimal...

    Don't forget there is no LZH!

  13. #13 Bulat Ziganshin:
    Quote Originally Posted by encode
    I posted just the basics. Authors like Bulat and Oscar can take some ideas from it

  15. #15 encode:
    To be honest, I'm just experimenting with my own EXE-packer based on this scheme. Using Pascal, the size of the DOS EXE stub is about 2 KB!

  16. #16 Bulat Ziganshin:
    Quote Originally Posted by encode
    With stuff like LZP it is optimal...
    Try to prove it. IMHO, looking two "moves" ahead instead of one isn't the same as parsing from the end. Don't forget that optimal parsing was invented for static LZSS encoding, not LZH; the pricing scheme was born in LZX, AFAIK.

  17. #17 Bulat Ziganshin:
    Quote Originally Posted by encode
    With LZP it works fast. Anyway, you can disable it.
    LZP+FP will be a slow beast. As I already said, even lazy matching doesn't work well in THOR-comparable fast modes, and my implementation of lazy matching doesn't re-evaluate the same match twice.

    So right now you should be getting a slow and badly-compressing tool. You can make it fast by disabling FP and making the hash 2-8 times smaller than the L2 cache in your processor. What next? I think adding an aricoder (it's very easy to reuse mine) and using, say, a 4-3-2 LZP scheme for searching matches. But there is a huge number of other variants, and you will need to test them all, as Oscar has done.

    As I already said, I will be glad to see the results of your work in the form of LGPL code that may be used in FreeArc and other archivers.
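    One possible reading of the "4-3-2 LZP scheme" mentioned above (my sketch only - the names, hash function, and table sizes are assumptions, not Bulat's or TOR's code): keep one table per context order and fall back from order-4 to order-3 to order-2 when a higher order has no prediction yet.

    #include <cstdint>

    static const int HBITS = 18;  // table size chosen to stay well below the L2 cache
    static int tab4[1 << HBITS];  // order-4 contexts (hashed)
    static int tab3[1 << HBITS];  // order-3 contexts (hashed)
    static int tab2[1 << 16];     // order-2 contexts (indexed directly)

    static inline uint32_t hash_n(const unsigned char *p, int order) {
        uint32_t x = 0;
        for (int k = 1; k <= order; k++)  // hash the 'order' bytes preceding p
            x = x * 2654435761u + p[-k];
        return x >> (32 - HBITS);
    }

    // return a predicted match position for position i (0 = no prediction yet),
    // preferring the highest-order context; the caller must ensure i >= 4
    int predict_and_update(const unsigned char *buf, int i) {
        uint32_t h4 = hash_n(buf + i, 4), h3 = hash_n(buf + i, 3);
        int c2 = buf[i - 1] | (buf[i - 2] << 8);
        int p = tab4[h4];
        if (!p) p = tab3[h3];               // fall back to lower orders
        if (!p) p = tab2[c2];
        tab4[h4] = tab3[h3] = tab2[c2] = i; // remember the current position
        return p;
    }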

  18. #18 encode:
    Okay, the authors of FP said that it is optimal, at least with LZW. As with LZP, with LZW we cannot parse strings from the end - the dictionary is built during compression. The same goes for LZP. Read the FP paper for the theoretical and other proofs...

  19. #19 encode:
    Okay, give me some time... Like I said, the main interest of this scheme is its simplicity - it's for the EXE-unpacker stub. I think I'll soon share some executable packed with this compression...

    In addition, the stub for DOS uses less than 64 KB of memory...

  20. #20 Oscar:
    Well, as Bulat said, I do NOT use optimal parsing. It can be said that THOR acquires some kind of learning.

  21. #21 Member:
    Quote Originally Posted by encode
    To be honest, I'm just experimenting with my own EXE-packer based on this scheme. Using Pascal, the size of the DOS EXE stub is about 2 KB!
    According to upack's readme.txt, Dwing managed to crunch LZMA decompression down to 447 bytes. I think it would be a challenge to outperform this. And, AFAIK, MEW 11 has the tiniest deflate-like decompression code; it's much smaller than upack's.

  22. #22 Moderator:
    Quote Originally Posted by Oscar
    Well, as Bulat said, I do NOT use optimal parsing. It can be said that THOR acquires some kind of learning.

  23. #23 encode:
    THOR with ef decompresses ENWIK9 in 53 sec; LZPM 0.02 in 59 sec.

    But compression...

    So, currently my main interest is in super-small LZ compressors for my EXE-packer.

    Just experimenting...

  24. #24 Member:
    Quote Originally Posted by encode
    So, currently my main interest is in super-small LZ compressors for my EXE-packer
    Can't stop drooling...

  25. #25 encode:
    Okay, some testing results with ENWIK8. Note that this LZP is heavily I/O-bound.

    16 MB buffer, pure order-2, greedy parsing:
    74,353,061 bytes, 2.797 sec

    16 MB buffer, pure order-2, flexible parsing:
    73,908,904 bytes, 3.766 sec

    16 MB buffer, hashed order-3, 20-bit hash, greedy parsing:
    69,605,454 bytes, 5.453 sec

    16 MB buffer, hashed order-3, 20-bit hash, flexible parsing:
    68,966,857 bytes, 7.656 sec

    For comparison:

    THOR 0.94, ef:
    55,138,788 bytes, 2.672 sec

    TORNADO 0.1, -1:
    60,063,659 bytes, 2.562 sec

    A few notes. First of all, currently I use just pure byte output, for simplicity and speed (a decoder sketch follows below):

    Literal:
    0xXX - the literal byte itself

    Match:
    0xB5 - match flag
    0xXX - match length
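    Here is a minimal decoder sketch for this byte output (my illustration, not the actual stub code). One detail the post leaves open is how a literal byte that happens to equal 0xB5 is represented; the escape used below (0xB5 followed by 0xFF) is purely an assumption, as are MINMATCH and the lzp_predict() helper.

    #include <cstdio>

    const int MINMATCH = 3;  // assumed; the post doesn't give the real value

    int lzp_predict(int i);  // hypothetical: position predicted by the context table

    void decode(FILE *in, unsigned char *out) {
        int i = 0, c;
        while ((c = fgetc(in)) != EOF) {
            if (c == 0xB5) {                 // match flag
                int lc = fgetc(in);
                if (lc == 0xFF) {            // assumed escape for a literal 0xB5
                    out[i++] = 0xB5;
                } else {
                    int len = lc + MINMATCH; // length byte stores len - MINMATCH
                    int p = lzp_predict(i);  // LZP: no offset is transmitted
                    while (len--) out[i++] = out[p++];
                }
            } else {
                out[i++] = (unsigned char)c; // any other byte is a plain literal
            }
        }
    }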

    Secondly, on some files this engine produces some interesting results:

    fp.log:
    LZP: 1,637,420 bytes
    TORNADO, -2: 1,743,480 bytes
    THOR, e: 1,780,864 bytes
    THOR, ef: 2,047,060 bytes
    TORNADO, -1: 2,269,331 bytes

    Anyway, this is only a draft version. I think that with some optimizations this one can be the fastest thing around. However, I'm just experimenting...


  26. #26 Moderator:
    Will you also be releasing this as a general ultra-fast file compressor?

  27. #27 encode:
    Currently, IMHO, it is not worth it - it needs to be improved... Better to look at the QUAD-SFX DEMO!

  28. #28 Programmer:
    I've been developing an ultra-fast compressor, too. I call it SLZ - I know it's lame.

    Currently, SLZ has the following features:
    - Huffman coder
    - simple LZ
    - 512K non-sliding window

    Asynchronous I/O is not done yet; I'll try to add it if I find some time. Here are some first results, taken with timer 3.01:

    Compression:
    ENWIK8:
    THOR ef: -> 55138792 in 1.406 seconds
    SLZ: -> 46037392 in 1.395 seconds

    MC (zip, stored):
    THOR ef: -> 19562288 in 0.562 seconds
    SLZ: -> 19366272 in 0.546 seconds

    Decompression to nul:
    ENWIK8:
    THOR ef: 2.062 seconds
    SLZ: 1.171 seconds

    MC (zip, stored):
    THOR ef: 0.921 seconds
    SLZ: 0.640 seconds

    Funny thing is, although SLZ is asymmetric, most of the time decompression is slower than compression: my damn Huffman decoder isn't as fast as the encoder. Well, it's still bloody fast.
    But I don't think I can improve the speed much further - the Huffman stage is already the bottleneck. I'll release something in the next few weeks, so you speed freaks will have one more tool to play with.
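    Not SLZ's code, but for reference: the standard trick that usually closes this gap is table-driven decoding - peek the next MAXBITS bits and map them to (symbol, code length) in a single lookup, instead of walking the Huffman tree bit by bit. A sketch, with the bit-reservoir refill left out:

    #include <cstdint>

    const int MAXBITS = 12;           // assumed longest Huffman code length

    struct Entry { uint16_t sym; uint8_t len; };
    static Entry table[1 << MAXBITS]; // built once from the code lengths: each
                                      // code's entry is replicated into every
                                      // slot whose top bits match the code

    static uint64_t bitbuf;           // bit reservoir, next code aligned to the top
    static int bitcnt;                // valid bits currently in the reservoir

    int decode_symbol() {
        unsigned idx = (unsigned)(bitbuf >> (64 - MAXBITS)); // peek MAXBITS bits
        Entry e = table[idx];
        bitbuf <<= e.len;             // consume only the code's real length
        bitcnt -= e.len;              // refill from the input when bitcnt runs low
        return e.sym;
    }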

  29. #29 Bulat Ziganshin:
    Is it an epidemic?

  30. #30 encode:
    Looks very promising. Can you tell what's inside? (Some details about the algorithm.)


