View Poll Results: What should I release next? (16 voters)

  • Improved LZ4X: 3 votes (18.75%)
  • Improved ULZ: 3 votes (18.75%)
  • Improved ULZ with a large window: 10 votes (62.50%)

Thread: LZ4X - An Optimized LZ4 Compressor

  #1 encode (Moscow, Russia):

    LZ4X - An Optimized LZ4 Compressor

    Okay, this is the very first release of my optimized version of the LZ4 algorithm. It is compatible with current LZ4 versions (the LZ4 Legacy frame). However, in future versions I'd like to remove decoder restrictions like "LAST_LITERALS" and add support for a larger block (this won't break backward compatibility with the current Legacy format).

    For those of you who are not familiar with LZ4: it was designed for fast compression with the fastest and simplest decoder possible. The goal of LZ4X is to produce the smallest .LZ4 files possible (the cb level) while staying compatible with the original LZ4!

    The program can be found at: https://github.com/encode84/lz4x

    Enjoy the new release!


  Thanks (9): comp1 (9th February 2016), Cyan (9th February 2016), DirtyPunk (17th February 2016), jibz (9th February 2016), Matt Mahoney (9th February 2016), Mike (9th February 2016), spark (28th February 2016), Turtle (11th February 2016), Zhabaloid (17th September 2016)

  #2 Member (Planet Earth):
    I used the same HTML file as in the big test a few days ago:

    Input:
    30,720 bytes HTML

    Output:
    12,589 bytes, 0.014 sec. - 0.009 sec., LZ4Opt cf (0.005 sec. - 0.001 sec. reported)
    11,902 bytes, 0.023 sec. - 0.009 sec., LZ4Opt cb (0.013 sec. - 0.000 sec. reported)

  Thanks: encode (9th February 2016)

  #3 Matt Mahoney (Melbourne, Florida, USA):
    Nice. I posted your results. http://mattmahoney.net/dc/text.html#3721

  Thanks: encode (9th February 2016)

  #4 encode:
    Thanks a lot!

  #5 m^2 (Ślůnsk, PL):
    Any chance you'll release the sources?

  #6 cbloom (US):
    FYI

    Oodle LZB16 (very similar to LZ4):

    enwik8 100,000,000 -> 41,936,109 = 3.355 bpb = 2.385 to 1

    encode 17.493 seconds

    Obviously a different machine, so the speeds aren't directly comparable, but it's a similar ballpark. So, you've got it right!

  Thanks: encode (10th February 2016)

  #7 encode:
    Can't say anything about the source right now, since the program is still not finished. I'm talking about the Brute mode: I have to do something with the match finder. The one I use right now goes nuts with these unlimited match lengths; on some highly redundant files it may nearly freeze.
    I'm pretty confident in the Fast/Normal compression modes right now, though.

    Great honor and pleasure to see you here, Charles!
    As a note, a few things may slightly improve compression here:
    • Getting rid of the LAST_LITERALS decoder restriction. If we do so, files won't be decompressed by current versions of LZ4. My decoder has no such restriction, by the way. The compression improvement is a couple of bytes.
    • Using the full range of match distances will save us a few bytes for sure. Again, this will break LZ4 compatibility completely.
    • A better match finder may improve compression by a couple of bytes, again.
    • A larger block, say 16 MB, will lead to a more noticeable compression improvement on ENWIK8/9.
    • I'm pretty confident in the current optimal path builder (string parser, if you will); however, it is possible to improve it further. Again, you win a couple of bytes at best.


  #8 joerg (Germany):
    First of all, thank you very much for your new program lz4opt.

    Is there a problem with the "cb" feature?

    If I compress a file (a 61 GB database backup) with mode -c, the compression doesn't take long, but with mode -cb it has now been running for 25 hours (visible in the Task Manager, using 30% CPU) and no progress is visible...

    ***
    c:\_2016-01>lz4opt
    Optimized LZ4 Compressor, v1.00
    Copyright (C) 2016 Ilya Muravyov

    Usage: lz4opt command infile outfile

    Commands:
    c Compress (cf = fast, cb = brute)
    d Decompress
    ***

    c:\_2016-01>lz4opt c backup.dat back-c
    backup.dat: 61781575680->16229966648 in 937.260s

    c:\_2016-01>lz4opt cb backup.dat back-cb
    backup.dat:
    ***

    best regards

    PS: What do you think about optimizing the speed of old compression algorithms like pbzip2?
    The last version, 1.1.13, is from 2015-12-18: https://launchpad.net/pbzip2/1.1/1.1...-1.1.13.tar.gz

    These days we have 4, 8, or more cores in a PC, and not many algorithms can make good use of that many cores...

  #9 encode:
    As to "cb" mode, look at my notes above. I call it "Brute" to warn users about its "speed". It's all about getting the best compression possible within given compression format (keeping the decoder compatibility). No matter how much time it will consume. Although, match finder must be improved here for sure. And yes, a progress indicator is pretty handy, but, hey, it was initial release!

  #10 m^2:
    How about a suffix array match finder?

  #11 Bulat Ziganshin (Uzbekistan):
    Quote Originally Posted by m^2:
    How about a suffix array match finder?
    Why should it be better?

  #12 cbloom:
    My LZB16 optimal parse uses a suffix array.

    For the specific parameters of:

    1. a small window (64k in this case)
    2. finding the absolute longest match length in that window (not approximate)
    3. finding matches at every position (optimal parse, not greedy)

    the suffix array was the fastest in my tests.

    The other algorithms all have terrible N^2 blowups unless you limit their search depth, which means they aren't finding the longest matches.

    Getting the suffix array searcher right is not trivial. You do something like build suffix arrays on 128k-byte chunks that overlap by 64k. When you string-search, you need to rule out strings that are ahead of the current position; that has to be done right or it ruins the speed.
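
    The chunked build he describes might be set up roughly like this (a sketch only; it assumes libdivsufsort's divsufsort() for the construction, and the actual string search is left out):
    Code:
    #include <divsufsort.h> // libdivsufsort
    #include <algorithm>
    #include <vector>

    // Build a suffix array per 128k chunk; consecutive chunks overlap by 64k,
    // so every position sees at least 64k of history inside its chunk's SA.
    void build_chunked_sa(const unsigned char* buf, int n)
    {
        const int STEP=64*1024, CHUNK=128*1024;
        std::vector<saidx_t> sa(CHUNK);
        for (int base=0; base<n; base+=STEP)
        {
            int start=std::max(0, base-STEP); // keep 64k of history behind 'base'
            int len=std::min(n-start, CHUNK);
            divsufsort(buf+start, sa.data(), len);
            // Match-finding for positions in [base, base+STEP) uses this SA;
            // candidate suffixes at or after the query position must be skipped.
        }
    }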

    Many blog posts on this topic:

    http://cbloomrants.blogspot.com/2012...sion-tree.html

    http://cbloomrants.blogspot.com/2011...t-release.html

    http://cbloomrants.blogspot.com/2011...th-suffix.html

    http://cbloomrants.blogspot.com/2011...ts-part-6.html

    http://cbloomrants.blogspot.com/2010...ring-pair.html

    http://cbloomrants.blogspot.com/2011...ndex-with.html


    In hindsight, I personally wouldn't bother with this. I'd just use a simpler matcher and accept that it doesn't get the absolute best matches.

  Thanks (4): Bulat Ziganshin (12th February 2016), Cyan (11th February 2016), DirtyPunk (17th February 2016), encode (12th February 2016)

  #13 joerg (Germany):
    @encode: thank you for your comment

    The program has now been running for 31 hours and really seems to be making progress...
    The output file is now bigger than it was 7 hours ago.

    best regards

  #14 cbloom:
    In theory, the sliding-window suffix tree would work well, but in practice that algorithm is too complicated and a fast implementation has never been done. I have a fast suffix tree, but it's non-sliding, and the current sliding suffix tree implementations are much slower than a suffix array.

    ADD:

    I just uploaded my string-match stress-test files. Post about it here:

    http://www.cbloom.com/rants.html

    http://www.cbloom.com/misc/string_match_stress_tests.7z

    The current LZ4Opt cb mode does get caught out on a few of these files (unsurprisingly). Anyway, maybe it will be useful for other string-match developers to test on these stress cases.

  Thanks (3): Cyan (12th February 2016), encode (12th February 2016), Turtle (12th February 2016)

  #15 joerg (Germany):
    @encode:
    First test result with my test file:
    lz5 version 1.4 with mode -13 compresses better and faster than the current lz4opt in mode -cb

    c:\_2016-LZ5>lz5 -13 backup.dat bac-lz5-13
    Compressed 61781575680 bytes into 10748535546 bytes ==> 17.40%

    The compression time for lz5 was 4 hours.

    Keep up the good work!
    best regards

  #16 encode:
    Quote Originally Posted by joerg:
    @encode:
    lz5 version 1.4 with mode -13 compresses better and faster than the current lz4opt in mode -cb
    And again you're testing the "Brute" mode for speed... Moreover, you're comparing apples to oranges here. Compare lz4opt with lzma then. Don't be misled by its name: lz5 has nothing in common with lz4. Otherwise I could just as well call lzma a modification of lz4. A larger dictionary is extremely important for these byte-aligned/no-Huffman LZ77 coders. Even a greedy LZ77 with a larger window may beat an optimal LZ with a smaller one...

  #17 Programmer (PL):
    @joerg: try "lz5 -13 -B7 backup.dat bac-lz5-13-b7"

    Quote Originally Posted by encode:
    lz5 has nothing in common with lz4. Otherwise I could just as well call lzma a modification of lz4.
    LZ5 -13, in comparison to LZ4, has:
    - different codewords
    - a larger window
    - an optimal parser
    But it is not LZ4+LZMA. Maybe it's somewhere in the middle between LZ4 and LZMA. LZMA has binary flags, arithmetic coding, and uses contexts.

  #18 Matt Mahoney:
    I use a suffix array (libdivsufsort) to find matches in zpaq -method 2. It quickly finds the longest match by scanning forward and back for the first non-future match and taking the longer of the two. But you also need an inverse suffix array to find where to start the scan. To save memory (4.5x block size instead of 8x), I build just 1/8 of the ISA 8 times.
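
    In code, the scan Matt describes might look like this (a minimal sketch of the idea, not zpaq's actual source; sa/isa are assumed prebuilt, e.g. with libdivsufsort):
    Code:
    // Longest match for text[pos..) that starts strictly before pos.
    // sa[r] = start of the r-th smallest suffix; isa is the inverse of sa.
    int longest_prev_match(const unsigned char* text, int n,
                           const int* sa, const int* isa,
                           int pos, int* match_pos)
    {
        int best=0;
        for (int dir=-1; dir<=1; dir+=2) // scan backward, then forward
        {
            for (int r=isa[pos]+dir; r>=0 && r<n; r+=dir)
            {
                if (sa[r]>=pos) continue; // skip future matches
                int len=0; // the first non-future neighbor bounds this side's LCP
                while (pos+len<n && text[sa[r]+len]==text[pos+len]) ++len;
                if (len>best) { best=len; *match_pos=sa[r]; }
                break; // take the longer of the two sides
            }
        }
        return best;
    }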

  #19 joerg (Germany):
    Results for lz4opt:

    c:\_2016-01>lz4opt c backup.dat back-c
    backup.dat: 61781575680->16229966648 in 937.260s

    c:\_2016-01>lz4opt cb backup.dat back-cb
    backup.dat: 61781575680->15251536527 in 173665.596s

    The program works correctly with -cb, but the time...

  Thanks: encode (12th February 2016)

  #20 encode:
    Working on a new release. Not really happy with the current program name, so:
    Code:
    Optimized LZ4 Compressor, v1.01
    Copyright (C) 2016 Ilya Muravyov
    
    Usage: LZ4X command infile [outfile]
    
    Commands:
        c Compress (ch=High, cx=Extreme)
        d Decompress


    My current goal is to improve the match finder. And I'm planning to simultaneously upgrade CRUSH's match finder as well.

  Thanks: Cyan (19th February 2016)

  #21 Member (United States):
    Quote Originally Posted by encode:
    Working on a new release. Not really happy with the current program name, so: [...]

    Great news, Ilya! Can't wait for both updated versions to be released!

  Thanks: encode (19th February 2016)

  #22 encode:
    I have tested many approaches and kept the binary tree match finder for the Extreme mode. Compared to the initial release, this is a night-and-day difference: the speed improvement is about 3x to 10x on average, paired with slightly higher compression!
    For the Normal (default) and High modes I kept the hash chain match finder. The new Normal mode is as fast as LZ4Opt's Fast mode, with higher compression. The brand-new High mode is kept for reference as the best greedy coder; a sketch of the general hash-chain technique follows below.
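
    For reference, the general hash-chain technique looks like this (a minimal textbook sketch, not LZ4X's actual code; the constants and the Knuth-style multiplicative hash are illustrative):
    Code:
    #include <cstring>
    #include <vector>

    struct HashChain
    {
        static const int HASH_BITS=16;
        static const int WINDOW=1<<16; // e.g. LZ4's 64 KB window

        std::vector<int> head=std::vector<int>(1<<HASH_BITS, -1);
        std::vector<int> prev=std::vector<int>(WINDOW, -1);

        static unsigned Hash4(const unsigned char* p) // Knuth-style multiplicative hash
        {
            unsigned v;
            std::memcpy(&v, p, 4); // caller must stop 4 bytes before the end
            return (v*2654435761u)>>(32-HASH_BITS);
        }

        // Walk the chain of earlier positions sharing the same 4-byte hash;
        // 'depth' bounds the work (and thus the match quality).
        // Find() is called before Insert() at each position.
        int Find(const unsigned char* buf, int pos, int n, int depth, int* mpos)
        {
            int best=0;
            for (int p=head[Hash4(buf+pos)]; p>=0 && depth-->0; p=prev[p&(WINDOW-1)])
            {
                if (pos-p>WINDOW) break; // fell out of the window
                int len=0;
                while (pos+len<n && buf[p+len]==buf[pos+len]) ++len;
                if (len>best) { best=len; *mpos=p; }
            }
            return best;
        }

        void Insert(const unsigned char* buf, int pos)
        {
            unsigned h=Hash4(buf+pos);
            prev[pos&(WINDOW-1)]=head[h];
            head[h]=pos;
        }
    };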



    New result for enwik9 is 372,073,904 bytes.


  #23 lz77 (Russia):
    Hi Ilya, does your lz77 compressor use only one thread in fast mode? Serge

  #24 encode:
    Quote Originally Posted by lz77:
    Hi Ilya, does your lz77 compressor use only one thread in fast mode? Serge
    Yep, the entire program is single-threaded. For super-fast, multi-threaded compression/decompression you may use the original LZ4. My program is intended to squeeze the last byte out of the compressed data.

  #25 encode:
    For hash function discussions, please use the dedicated threads, like: http://encode.su/threads/62-Knuth-s-...cative-Hashing

    BTW, LZ4X v1.01 is now available!

    Look at the first post and CompressMe.net

  Thanks (2): comp1 (24th February 2016), spark (28th February 2016)

  #26 encode:
    I have tested some LZ4 modifications:
    [results table lost in the forum export]
    LZ4 with a 128 KB window (we grab one bit from the literal-run field) looks pretty good, and IMHO it is the best candidate for LZ4v2 (or however we should call it).
    Or I can release my own program, e.g. ULZ, as the LZO/LZ4 competitor.

  #27 encode:
    BTW, as with LZO, a match distance of 0 could mean EOF.

  #28 encode:
    The token byte could look like:

    rrr o llll (3+1+4 bits)

    rrr - literal run length

    o - the highest bit of the match distance (17-bit window)

    llll - match length, as usual
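
    A decoder for this layout might pull the fields apart like this (a sketch in the style of the snippet in post #30 below; GET_BYTE() is assumed, and the extra length bytes for run/length values of 7/15 are omitted):
    Code:
    int Token=GET_BYTE();
    int Run=Token>>5;          // rrr: literal run length (7 = read more bytes)
    int Dist=(Token&0x10)<<12; // o: becomes bit 16 of the match distance
    Dist+=GET_BYTE();          // low byte of the distance
    Dist+=GET_BYTE()<<8;       // middle byte of the distance
    int Len=Token&15;          // llll: match length (15 = read more bytes)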


  #29 m^2:
    Why does 128 look better than 256 to you?
    I'd try to lengthen it still.

  #30 encode:
    Quote Originally Posted by m^2:
    Why does 128 look better than 256 to you?
    I'd try to lengthen it still.
    In some scenarios, even a 128K window might be too large (small-chunk compression).
    However, as a general rule, a larger window (dictionary) = better compression, especially if we're talking about byte-aligned LZ coders like LZ4 or some versions of LZO. At the same time, at some point we must start using variable-length codes for offset (match distance) coding, and IMO 64K-256K is the limit if we code the distance in fixed bits. In addition, the bigger the dictionary, the slower the compressor.
    I have tested many LZ4 modifications and came up with the Enhanced LZ4: LZ4 with a 256 KB window (vs 64 KB) - LZ4X or LZ4v2.
    It's like Deflate vs Enhanced Deflate (Deflate64).
    Actually, everything is pretty much the same as with the original LZ4 legacy frame (the same .lz4 file extension), except:
    New magic number - possibly "LZ4X"
    Block size is 16 MB
    A match distance of 0 is either unused or represents the EOF marker
    The same file structure: compressed size, compressed data, etc.
    The main difference: window size = 256 KB
    To make this possible we modify the "Token" byte as follows:

    rrr oo lll

    The 3 highest bits represent the literal run length (7 means read one extra byte; 255 there means read another one, etc.)

    The 2 middle bits represent the highest bits of the match distance

    The 3 lowest bits represent the match length (7 means read one extra byte, etc.)

    With such a token structure we can extract the fields efficiently:
    Code:
    int Token=GET_BYTE();
    if (Token>=32) // literal run length is non-zero
    {
        int LiteralRun=Token>>5; // rrr; 7 means extra length bytes follow
        // ... copy the literals
    }

    int MatchPos=Pos-((Token&0x18)<<13); // oo -> bits 16..17 of the distance
    MatchPos-=GET_BYTE();                // low byte of the distance
    MatchPos-=GET_BYTE()<<8;             // middle byte of the distance
    if (MatchPos==Pos) // a distance of 0 marks EOF
        // ...

    int MatchLen=Token&7; // lll; 7 means extra length bytes follow
    // ... copy the match
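
    For symmetry, the encoder side would pack the token roughly like this (a sketch under the same assumed layout; PUT_BYTE() is hypothetical, and the extra length bytes for runs/matches of 7 or more are omitted):
    Code:
    int Token=((Run<7?Run:7)<<5)   // rrr: literal run length
        |(((Dist>>16)&3)<<3)       // oo: top two bits of the 18-bit distance
        |(Len<7?Len:7);            // lll: match length
    PUT_BYTE(Token);
    PUT_BYTE(Dist&255);            // low byte of the distance
    PUT_BYTE((Dist>>8)&255);       // middle byte of the distance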

    To be continued...
