View Poll Results: The default window size of CRUSH should be: (16 voters)

  • 512 KB - 1 vote (6.25%)
  • 1 MB - 4 votes (25.00%)
  • 2 MB - 11 votes (68.75%)

Thread: CRUSH 0.01 is here!

  1. #1 - encode (The Founder), Moscow, Russia

    CRUSH 0.01 is here!

    Hi all!

    Please welcome CRUSH - my new LZ77 encoder designed for extremely fast decompression while maintaining a good compression ratio.

    Check out the LTCB results:

    ENWIK8
                  Comp. Size          Comp. Time     Decomp. Time
    crush cf      37,401,090 bytes    2.324 sec      0.452 sec
    crush c       33,618,865 bytes    15.553 sec     0.421 sec
    crush cx      32,577,338 bytes    61.995 sec     0.405 sec

    ENWIK9
                  Comp. Size          Comp. Time     Decomp. Time
    crush cf      330,975,986 bytes   21.216 sec     4.212 sec
    crush c       297,103,092 bytes   129.480 sec    3.915 sec
    crush cx      287,333,602 bytes   532.011 sec    3.790 sec

    Decompressing ENWIK9 in under 4 seconds?!

    System specs:
    Windows 7 Ultimate SP1
    Intel Core i7-2600 @ 3.4 GHz
    Kingston 8GB DDR3
    WD VelociRaptor 10000RPM 600GB SATA3

    As you can see, CRUSH has three compression modes:

    Fast:

    crush cf enwik8 enwik8.cru

    Designed for fast compression

    Normal:

    crush c enwik8 enwik8.cru

    Designed for good balance between compression ratio and speed

    Max:

    crush cx enwik8 enwik8.cru

    Designed for maximum compression!



    I hope you'll enjoy my new program!

    EDIT: Uploaded a Linux compile. Note that this one is slower than the Windows compile!

  2. #2 - m^2 (Member), Ślůnsk, PL
    X-Firefox 4 directory:
    program                 size          decompression time (ms)
    crush cx                17,660,600    390
    LZMH -mx9               15,269,587    281
    LZOP -9                 19,105,149    218
    LZ4HC                   20,361,000    109
    LZHAM -m4 -x -e -d23    14,677,337    1281

  3. #3 - Matt Mahoney (Expert), Melbourne, Florida, USA

  4. #4 - encode (The Founder), Moscow, Russia
    Thanks a lot!

  5. #5 - Cyan (Member), France
    Wow, LZMH looks like a very strong contender in this category too.
    Just out of curiosity, would you mind probing how ZhuffMax compares with these compressors from a decompression perspective, please?

    Rgds
    Last edited by Cyan; 18th May 2011 at 01:27.

  6. #6 - encode (The Founder), Moscow, Russia
    As a note, the decompression speed of CRUSH also depends on the data type. And it's not really fair to compare it against LZOP and LZ4HC ("High Compression"), since the ENWIK9 results are:

    LZ4HC -> 392,102,544 bytes
    LZOP -> 366,349,786 bytes
    CRUSH -> 287,333,602 bytes

    CRUSH is in a different league. Otherwise, you might as well add LZMA to the comparison.

    BTW, where can I get LZMH?

  7. #7 - Surfer (Member), oren
    Quote Originally Posted by encode:
    BTW, where can I get LZMH?
    http://encode.su/threads/1117-LZHAM?...ll=1#post22513

  8. #8 - m^2 (Member), Ślůnsk, PL
    Quote Originally Posted by Cyan:
    Wow, LZMH looks like a very strong contender in this category too.
    Just out of curiosity, would you mind probing how ZhuffMax compares with these compressors from a decompression perspective, please?

    Rgds
    Sure.
    program                 size          decompression time (ms)
    crush cx                17,660,600    390
    LZMH -mx9               15,269,587    281
    LZOP -9                 19,105,149    218
    LZ4HC                   20,361,000    109
    LZHAM -m4 -x -e -d23    14,677,337    1281
    ZhuffMax                18,053,092    250
    Honestly, I have no idea what's up with LZMH.
    It's usually a clear leader in my tests, but that's just crazy.

  9. #9 - m^2 (Member), Ślůnsk, PL
    I decided to do another test: ffmpeg, where 1/3 of the file size is text.
    program      size          decompression time (ms)
    crush cx     13,815,247    359
    LZMH -mx9    11,271,716    281
    LZOP -9      19,307,657    281
    LZ4HC        20,627,230    140
    ZhuffMax     16,137,103    328

    Note: in both tests the data was tarred, so LZMH probably doesn't use its exe filters.
    Last edited by m^2; 18th May 2011 at 11:02.

  10. #10 - Cyan (Member), France
    Thanks for the tests.
    Indeed, LZMH results are unexpectedly strong in this area.
    That's very interesting, because it provides a useful comparison point.

    What kind of tool are you using to measure decompression time? Something like Igor's timer program?

    Regards

  11. #11 - m^2 (Member), Ślůnsk, PL
    Exactly Igor's timer.
    I used user times for this thread.

  12. #12 - encode (The Founder), Moscow, Russia
    I decided to make CRUSH an Open Source project (Public Domain at SourceForge). I will do some optimizations and code changes (a larger dictionary?) to make the source portable and system-independent.

  13. The Following 2 Users Say Thank You to encode For This Useful Post:

    Jan Ondrus (28th June 2013), Stephan Busch (26th June 2013)

  14. #13 - Matt Mahoney (Expert), Melbourne, Florida, USA
    Glad to hear it. Now I can add it to the Silesia benchmark.

  15. #14 - m^2 (Member), Ślůnsk, PL
    Thumbs up, Encode!
    I've been longing to add it to my benchmark too.

  16. #15 - encode (The Founder), Moscow, Russia
    Just uploaded the original source from 2011, "AS-IS". So please consider it a "DRAFT".

    I need to put some more work into it.

    Anyway, please check out the source and post your comments & suggestions!

  17. #16 - encode (The Founder), Moscow, Russia
    CRUSH with different window sizes (Default is 1M)

    ENWIK9:
    Window    Comp. Size     Comp. Time
    128K      317,968,544    75.169 sec
    256K      306,253,573    137.885 sec
    512K      296,136,211    251.812 sec
    1M        287,333,602    450.193 sec
    2M        279,853,505    1191.344 sec
    4M        273,706,084    2791.356 sec
    8M        268,778,654    5192.447 sec

    ENWIK8:
    Window    Comp. Size    Comp. Time
    128K      35,858,208    8.872 sec
    256K      34,602,276    16.402 sec
    512K      33,505,907    30.537 sec
    1M        32,577,338    55.836 sec
    2M        31,786,650    144.529 sec
    4M        31,130,536    341.520 sec
    8M        30,606,794    621.832 sec

    dickens:
    Window    Comp. Size    Comp. Time
    128K      3,833,331     1.558 sec
    256K      3,663,576     2.968 sec
    512K      3,540,420     5.642 sec
    1M        3,437,737     9.891 sec
    2M        3,358,364     25.099 sec
    4M        3,315,170     55.561 sec
    8M        3,298,291     76.435 sec

    Decompression is equally fast regardless of the window size, and in the cf/c modes compression stays fast enough with any window size. The cx mode is for max compression anyway!

  18. #17 - m^2 (Member), Ślůnsk, PL
    Here we go:
    * Porting to FreeBSD (just to make it run) was easy; I guess that since you made an Ubuntu binary, you had already done everything.
    * The code is nearly C-compatible. What do you think about switching languages? It would make crush more portable.
    * Bit I/O... hmm... not a top performer, and I don't immediately see a way to improve it. Still, crush used to be competitive.
    * It would be good to have access to the latest sources. You did some porting already and published a binary; I'm interested in what you changed.
    * I need to convert crush to a library. Would you be willing to accept the patches into your trunk?
    * It looks quite good for a draft.
    * I'm positively surprised by memory usage.
    * Overall, I'm pretty excited about it.
    Last edited by m^2; 27th June 2013 at 19:07.

  19. The Following User Says Thank You to m^2 For This Useful Post:

    encode (28th June 2013)

  20. #18 - encode (The Founder), Moscow, Russia
    • (sizeof(int) == 4) with most 32- and 64-bit compilers (except for old 16-bit DOS stuff). I might change it to int32_t eventually. Anyway, I'm a Visual C++ guy
    • C++ gives me more freedom in some cases
    • As for the library, I already added a Crush namespace: Crush::Src=xxx; Crush::Dst=yyy; Crush::Compress(0);
    • I might enlarge the default window size to 2 MB

  21. #19 - encode (The Founder), Moscow, Russia
    BTW, I'll never try to keep the source compiling with any compiler on any platform with no warnings at all - it's not really possible. Otherwise, the CRUSH source will look like LZOP - completely unreadable...

  22. #20 - m^2 (Member), Ślůnsk, PL
    Quote Originally Posted by encode:
    (sizeof(int) == 4) with most 32- and 64-bit compilers (except for old 16-bit DOS stuff). I might change it to int32_t eventually. Anyway, I'm a Visual C++ guy
    Ah, I mistook it for long, you're right.
    C++ gives me more freedom in some cases
    OK. It's just that the published code didn't really use it.
    As for the library, I already added a Crush namespace: Crush::Src=xxx; Crush::Dst=yyy; Crush::Compress(0);
    I suggest making it an object, so it can be thread-safe.
    I might enlarge the default window size to 2 MB
    What do you think about making it a runtime variable? Or, if it makes the compiler optimise less, a template parameter?
    BTW, I'll never try to keep the source compiling with any compiler on any platform with no warnings at all - it's not really possible. Otherwise, the CRUSH source will look like LZOP - completely unreadable...
    It can be portable and readable. Look at LZ4: if not for the speed optimisations, it would be quite clean. Though adding a ton of definitions is not really avoidable.
    BTW, it compiles cleanly with gcc 4.6 with -Wall -Wextra -pedantic already. Though it's not so good with Clang.

  23. #21 - Matt Mahoney (Expert), Melbourne, Florida, USA
    I thought the code was easy enough to figure out, so I posted a description of the algorithm to http://mattmahoney.net/dc/text.html#2873
    Please let me know if I misunderstood. I didn't re-test, but I recalculated the size based on the source code being smaller.

    Edit: updated Silesia benchmark. http://mattmahoney.net/dc/silesia.html
    Last edited by Matt Mahoney; 28th June 2013 at 03:42.

  24. #22 - encode (The Founder), Moscow, Russia
    Just one thing: the match finder hashes 3- and 4-byte strings, not order-3 and order-4 contexts. (A context is the bytes preceding the current position; a string is the bytes that follow it.)
    The "penalty" scheme works in the main string-searching routine too. Thus the compressor will prefer short matches at close distances over slightly longer matches at extremely far distances. The penalty is 1 byte per 4 extra bits of offset compared to the previous closest offset. Accordingly, during lazy matching the compressor will not drop the current match if the next match is longer but has a much more distant offset than the current one.

  25. #23 - Matt Mahoney (Expert), Melbourne, Florida, USA
    Thanks. I updated the wording so I hope it is more clear. With regard to the poll on window size, I'm not sure why this should have a big effect on speed. Does it implicitly increase the length of the hash chain? I did a comparison of crush cx enwik8 with zpaq -method 2 -threads 1 on a 2.0 GHz T3200 (Win32), since crush is single threaded and both use 64 MB blocks. I used -threads 1 to decompress also to make the comparison fair.

    crush -> 32,577,338, 448s, 1.27s
    zpaq -> 32,785,289, 29s, 4.3s

    zpaq takes longer to decompress partly because it verifies SHA-1 checksums. Turning this off with -fragile cuts the time to 3.4s, which is still slower than crush, but still faster than my disk can write.

    zpaq doesn't implement 1-byte look ahead, but I am inspired by your code to try it. But testing a hash chain of 4K each for a direct match and a 1-byte look ahead seems excessive to me. Also, if the look ahead wins, crush just codes a literal and repeats the search on the next iteration. I think it would be faster if the result of that search was saved. One optimization I plan to try is to start the search from the second byte to find the longest match, then test if the first byte is the same and code accordingly.

    What I currently do with -method 2 compression is have 2 hash tables of 8 and 4 byte strings with 16 pointers in each bucket instead of a linked list. The long strings are tested first. If a match of length 8 is found, the second set is skipped. If a match of 128 or longer is found at any point, it is taken immediately. Ties are broken by choosing the smaller offset, which I guess is not optimal because offset code lengths can vary by 23 bits.

    The buckets are aligned on 64 byte cache boundaries. To reduce cache misses when searching the output buffer, zpaq packs the low 6 bits of the first byte with the 26 bit pointers and compares it first before comparing the strings in the buffer. For bigger blocks, fewer bits are packed. I do need to try your idea of comparing the last byte of the match first.

    Also, I don't let the hash chain get arbitrarily long. I use a quick pseudo-random hash of the current buffer position to choose one of the 16 bucket elements at random to update. Previously I used the low bits of the buffer position to choose an element, which works almost as well.
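
    As an aside, the "pseudo-random slot to update" idea fits in a couple of lines; this is only a sketch with an arbitrary mixing constant, not the code zpaq actually uses:
    Code:
    #include <cstdint>
    #include <cstdio>

    // Pick one of the 16 bucket slots from the buffer position with a cheap
    // multiplicative mix, so a different slot gets overwritten on each update.
    static int slot_for_position(uint32_t pos)
    {
        return (int)((pos * 2654435761u) >> 28);  // top 4 bits -> 0..15
    }

    int main()
    {
        for (uint32_t pos = 0; pos < 8; ++pos)
            printf("pos %u -> slot %d\n", (unsigned)pos, slot_for_position(pos));
        // In a match finder one would then do: bucket[slot_for_position(pos)] = pos;
        return 0;
    }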

    I think that the coding efficiencies are similar. Instead of coding a literal in 9 bits, zpaq emits a 2 bit code and the literal length code using an Elias Gamma code, i.e. coding an n bit number using 2n-1 bits. Instead of 1..5 literals being coded in 9, 18, 27, 36, 45 bits, it would be 11, 21, 29, 39, 47 bits. Long literal strings would be coded slightly smaller. For match lengths, I again use an Elias Gamma code for the high bits and code the low 2 bits directly.
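
    For reference, an Elias Gamma encoder matching that "2n-1 bits for an n-bit number" description can be sketched like this (an illustration only; the bit output is a plain vector rather than zpaq's actual bit I/O):
    Code:
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Elias Gamma: for a value whose binary form has n bits, emit n-1 zero bits
    // followed by the value itself, MSB first -- 2n-1 bits in total.
    static void gamma_encode(std::vector<int>& bits, uint32_t x)  // x >= 1
    {
        int n = 0;
        while ((x >> n) > 1) ++n;                    // n = bit length of x, minus 1
        for (int i = 0; i < n; ++i) bits.push_back(0);
        for (int i = n; i >= 0; --i) bits.push_back((x >> i) & 1);
    }

    int main()
    {
        std::vector<int> bits;
        gamma_encode(bits, 5);                       // 5 = 101b -> 00 101
        for (int b : bits) printf("%d", b);          // prints 00101
        printf("\n");
        return 0;
    }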

    For match offsets, zpaq uses 5 bits to code the number of offset bits over 24 possibilities like 0..23 for 16 MB blocks or 2..25 for 64 MB blocks. This is similar to crush except for allowing longer offsets and implicitly assigning a probability of 3/4 to a match instead of 1/2. The correct value depends on the source, but I think we agree pretty much on the distribution of match lengths and offsets. An offset of n has probability proportional to about 1/(n+16). A match length of n has a probability proportional to 1/(n+4)^2. I found literals of length n have probability proportional to 1/n^2.
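
    As described, both formats essentially send a small field giving the number of offset bits and then the offset bits themselves. A simplified, self-contained sketch of that general idea (not the exact crush or zpaq coding) round-trips like this:
    Code:
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Simplified "bit count, then bits" offset coding: a 5-bit field gives the
    // number of significant bits, then the offset follows with its leading 1 implicit.
    static void code_offset(std::vector<int>& bits, uint32_t offset)  // offset >= 1
    {
        int n = 0;
        while ((offset >> n) != 0) ++n;                       // significant bit count
        for (int i = 4; i >= 0; --i) bits.push_back((n >> i) & 1);
        for (int i = n - 2; i >= 0; --i) bits.push_back((offset >> i) & 1);
    }

    static uint32_t decode_offset(const std::vector<int>& bits, size_t& pos)
    {
        int n = 0;
        for (int i = 0; i < 5; ++i) n = (n << 1) | bits[pos++];
        uint32_t offset = 1;                                  // implicit leading 1
        for (int i = 0; i < n - 1; ++i) offset = (offset << 1) | bits[pos++];
        return offset;
    }

    int main()
    {
        std::vector<int> bits;
        code_offset(bits, 123456);                            // needs 17 bits
        size_t pos = 0;
        printf("decoded: %u\n", (unsigned)decode_offset(bits, pos));  // 123456
        return 0;
    }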

  26. #24 - encode (The Founder), Moscow, Russia
    Yep, for a large dictionary we must hash 5-byte strings or use something more advanced than simple hash chains, such as a binary tree. With 4-byte strings the hash chains are way too long, especially with text data.
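
    For illustration, hashing 5-byte strings for the match-finder heads could look something like this (a sketch with an assumed table size and mixing constant, not the hash CRUSH actually uses):
    Code:
    #include <cstdint>
    #include <cstdio>

    static const int HASH_BITS = 21;              // 2^21 head entries (assumed size)

    // Hash the 5 bytes at p down to HASH_BITS bits with a multiplicative mix;
    // hashing longer strings keeps the chains short on large windows of text.
    static uint32_t hash5(const unsigned char* p)
    {
        uint32_t h = 0;
        for (int i = 0; i < 5; ++i)
            h = h * 2654435761u + p[i];
        return h >> (32 - HASH_BITS);
    }

    int main()
    {
        const char text[] = "the quick brown fox";
        printf("hash5 of \"the q\" = %u\n",
               (unsigned)hash5(reinterpret_cast<const unsigned char*>(text)));
        return 0;
    }
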
    ZLib/GZip uses the lazy evaluation you described. CRUSH is intended to be used with the "c" and "cf" modes; "cx" is an uncompromising mode for max compression at any cost. Also, my approach is suitable for 2-byte-lookahead optimized matching (check my GZip Hack); unfortunately, such a thing is not suitable for coders that simply store the literals.
    Literal runs may complicate parsing optimization - you must take the literal-run state into account, i.e. there is no reason to drop a match (and produce a literal) if we previously coded a match (literal run = 0).

  27. #25 - m^2 (Member), Ślůnsk, PL
    encode, I vote for 8 MB. "The cx mode is for max compression anyway!"

  28. #26 - encode (The Founder), Moscow, Russia
    Added 9 compression levels - from Fast to Extreme
    Code:
    CRUSH by Ilya Muravyov, v1.00
    Usage: CRUSH command infile outfile
    Commands:
      c[1..9] Compress (Fast..Extreme)
      d       Decompress

  29. #27 - encode (The Founder), Moscow, Russia
    Timings on ENWIK8 (2 MB dictionary)
    1 - 100000000 -> 41166959 in 1.533s
    2 - 100000000 -> 38966076 in 1.689s
    3 - 100000000 -> 37282880 in 2.011s
    4 - 100000000 -> 35967018 in 2.708s
    5 - 100000000 -> 34938903 in 3.856s
    6 - 100000000 -> 33566700 in 10.079s
    7 - 100000000 -> 32890930 in 26.581s
    8 - 100000000 -> 32635677 in 49.457s
    9 - Not yet implemented

    5 is the default, 8 and 9 are Max and Extreme modes for maniacs

  30. #28 - Nania Francesco (Tester), Italy
    Very nice results!

  31. #29 - encode (The Founder), Moscow, Russia
    Well, the CRUSH source has been completely rewritten - I've changed the programming style to be cleaner and neater and added some comments. Some parts of the code were optimized - the decompressor is now slightly faster. I checked Linux compatibility (KNOPPIX LiveCD); for full compatibility, I'm thinking of removing that "X -> X in X sec" output.
    Now I'm (re)optimizing the bit codes for match length with my brute-force optimizer, as well as other parameters like the penalty function, etc. My goal is higher compression than the original CRUSH at the same window size!
    Please vote for the window (dictionary) size above!

  32. #30 - encode (The Founder), Moscow, Russia
    I've almost finished working on CRUSH. The bit codes are pretty adequate, so I'll keep the compression format as is. Slightly improved parsing:

    ENWIK8:

    CRUSH 1M:
    cf - 100000000 -> 37408925 in 1.918s
    c - 100000000 -> 33581076 in 13.166s
    cx - 100000000 -> 32529467 in 52.572s

    CRUSH 2M
    cf - 100000000 -> 37308893 in 2.043s
    c - 100000000 -> 32878537 in 25.24s
    cx - 100000000 -> 31731711 in 121.243s

    world95.txt:

    CRUSH 1M:
    cf - 2988578 -> 870204 in 0.062s
    c - 2988578 -> 722986 in 0.249s
    cx - 2988578 -> 693217 in 1.06s

    CRUSH 2M:
    cf - 2988578 -> 867591 in 0.078s
    c - 2988578 -> 709912 in 0.358s
    cx - 2988578 -> 678869 in 1.653s


  33. The Following 2 Users Say Thank You to encode For This Useful Post:

    Nania Francesco (1st July 2013), samsat1024 (1st July 2013)

