
Thread: New CM compressor in development

  1. #1
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts

    New CM compressor in development

    I have been working on an experimental CM compressor for the past few weeks. The performance has only recently become acceptable, so I thought I'd release it, closed source for now. It is not too complicated yet (no hash collision resolution, SSE, ISSE, or BCJ). By default it's tuned for text, but you can disable the word model. I welcome any feedback!

  2. The Following User Says Thank You to Mat Chartier For This Useful Post:

    Stephan Busch (3rd June 2013)

  3. #2
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Just realized I had left AVX extensions enabled in code gen, probably resulting in most people not being able to run the program. New version has this and a few other bugs fixed and a reduced initialization time.

  4. #3
    Member
    Join Date
    Jan 2010
    Location
    France
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I quickly tested your MCM on a few tarred program directories, and it compressed better than WinRAR, although it was much slower.
    Good luck for the future of your program :3

  5. #4
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    460
    Thanked 175 Times in 85 Posts
    MCM 0.0 ranks #26 on the SqueezeChart, which means it is already in the top 30.
    I will publish results later.

  6. #5
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts

  7. #6
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Thank you both for running these benchmarks! I'll try to see if I can improve the speed any more, as well as binary / exe / text detection.

  8. #7
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    460
    Thanked 175 Times in 85 Posts
    Results are now online at http://www.squeezechart.com

  9. #8
    Member FatBit's Avatar
    Join Date
    Jan 2012
    Location
    Prague, CZ
    Posts
    189
    Thanks
    0
    Thanked 36 Times in 27 Posts
    Dear Mr. Chartier,

    Did you produce only 64-bit versions? When I run either the newer or the older program, I get the message "This is not valid Win32 program." Tested on Win XP SP3, Czech version.

    Best regards,
    FatBit

  10. #9
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    That's strange, what CPU do you have? The compressor requires SSE2, but nearly every CPU should support this.

  11. #10
    Member FatBit's Avatar
    Join Date
    Jan 2012
    Location
    Prague, CZ
    Posts
    189
    Thanks
    0
    Thanked 36 Times in 27 Posts
    It is an Intel Centrino Mobile Pentium M at 1.5 GHz, about 10 years old, with the 855PM chipset.

  12. #11
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Ah ok, I'll see if I can remove the SSE2 requirements in the next version. It should be ready in around a week. Hopefully that will fix it.

  13. #12
    Member FatBit's Avatar
    Join Date
    Jan 2012
    Location
    Prague, CZ
    Posts
    189
    Thanks
    0
    Thanked 36 Times in 27 Posts
    Maybe separate builds would be a good solution: a newer/faster version and an older/slower one.

  14. #13
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    My test was in 32 bit Vista (2 GHz T3200) and it worked.

    BTW, ZPAQ requires SSE2 instructions. I thought every processor has them by now. If not, you can compile with -DNOJIT but it will be slow. I know somebody compiled an older version for ARM and it worked.

  15. #14
    Member FatBit's Avatar
    Join Date
    Jan 2012
    Location
    Prague, CZ
    Posts
    189
    Thanks
    0
    Thanked 36 Times in 27 Posts
    I successfully ran zpaq 6.28 and zpaqd 6.27 on Win XP SP3, Czech edition, 32-bit.

    Best Regards,
    FatBit

  16. #15
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    It's very strange that it doesn't work on Windows XP. I'm using VS2012 to compile it, so that might have something to do with it. On a side note, does anybody know a good way to figure out where to add new states to a PAQ-like state machine? I currently have 105/255 unused states. The state machine was generated with a simple brute-force algorithm on enwik6.

  17. #16
    Member FatBit's Avatar
    Join Date
    Jan 2012
    Location
    Prague, CZ
    Posts
    189
    Thanks
    0
    Thanked 36 Times in 27 Posts
    If I remember correctly, user ENCODE had to downgrade from a newer Visual Studio to an older one because Win XP support was removed in the new version (and partially restored later?). I am not able to find the post in the forum.

    Best regards,
    FatBit

  18. #17
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    You could use the StateTable class in the ZPAQ reference decoder to generate a PAQ state table. http://mattmahoney.net/dc/unzpaq200.cpp

    I had intended to have 255 states but due to a design error I discovered much later that only 219 states are reachable. I left it that way so I would not break compatibility with the standard.
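The reachability point above can be checked mechanically for any next-state table. The following is a minimal sketch, not the ZPAQ StateTable code: the `StateTable` layout (`next[s][bit]`) and the function names are hypothetical, chosen only for illustration. It walks the table from a start state and reports which states can ever be reached, which is one way to find where unused state numbers are left over.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: next[s][bit] is the state entered from s after coding `bit`.
using StateTable = std::vector<std::array<std::uint8_t, 2>>;

// Depth-first walk from `start`; seen[s] is true for every state that can occur.
std::vector<bool> reachable_states(const StateTable& next, std::uint8_t start) {
    assert(start < next.size());
    std::vector<bool> seen(next.size(), false);
    std::vector<std::uint8_t> stack{start};
    seen[start] = true;
    while (!stack.empty()) {
        const std::uint8_t s = stack.back();
        stack.pop_back();
        for (const std::uint8_t t : next[s]) {
            assert(t < next.size());
            if (!seen[t]) {
                seen[t] = true;
                stack.push_back(t);
            }
        }
    }
    return seen;
}

// Number of states that are actually used when starting from `start`.
std::size_t count_reachable(const StateTable& next, std::uint8_t start) {
    const std::vector<bool> seen = reachable_states(next, start);
    return static_cast<std::size_t>(std::count(seen.begin(), seen.end(), true));
}
```

Running this on a generated table and comparing the count with the table size shows how many state numbers are wasted; the `false` entries in the returned vector are the candidates for reassignment.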

    SSE2 is supported on Pentium M. It is supported on most Intel processors since 2001 and AMD since 2003. In ZPAQ, SSE2 is only required for the MIX component, so the faster methods that don't use it (1, 2, and 3) should still work. Or you can compile with -DNOJIT for any processor.

    ZPAQ will run on Windows XP, but probably not older versions. When I make calls to Windows I make sure the function is supported at least back to XP.

  19. #18
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    Originally posted by Mat Chartier:
    It's very strange that it doesn't work on Windows XP. I'm using VS2012 to compile it so that might have something to do with it.
    http://stackoverflow.com/questions/1...al-studio-2012 :
    Visual Studio 2012 Update 1 has now been released, and adds official support for running apps built with VC++ 2012 on Windows XP.
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time? I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  20. #19
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I guess Microsoft forgot that 38% of PCs are still running WinXP, like it or not.

  21. #20
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    They try to push it out of the market at every opportunity they get. But the customer backlash is still very strong in many places.

  22. #21
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    From my statistics and analysis, MCM's results in the WCC, which I will publish soon, are truly remarkable. Of course I do not know whether the program uses a PPM-type system (byte-wise compression) or a CM-type system (bit-wise compression), but if I can give some advice, I think the way to go is to make it simple and fast, and not the other way around!

  23. #22
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    It uses CM.

  24. #23
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Hi Nania, currently I'm using CM with 6 contexts: o1/match, word, o2, o3, o4, o6. Contexts are selected on a byte basis. I'm not too sure how to increase the speed any more: in mcm v0.0 each context rarely hits more than two cache lines in the hash table when encoding/decoding a byte. Using some xor tricks, I recently managed to get a guarantee that each context will hit at most 2 cache lines in the hash table, but this is only a very minor performance improvement. I guess the next lowest-hanging fruit is the match model, which takes around 20% of compression time.
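One way such a two-cache-line guarantee can be obtained is sketched below. This is not MCM's actual code: the one-byte slots, the 16-slot groups, the hash constants, and the names `group_base` and `slot_index` are all assumptions for illustration, and the table is assumed to start on a 64-byte boundary. Each byte is coded as two nibbles; each nibble gets one aligned 16-slot group, and the partial-nibble context is XORed into the low 4 bits of the group base, so every lookup for that nibble stays inside the same group and therefore inside one cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;    // assumed cache line size in bytes
constexpr std::size_t kNibbleSlots = 16;  // slots 1..15 hold partial-nibble states

// Hash the byte-level context together with a tag for the nibble being coded
// (hypothetical: 0 for the high nibble, 1 + high nibble value for the low one)
// and return the first slot of an aligned 16-slot group.
std::size_t group_base(std::uint32_t ctx_hash, std::uint32_t nibble_tag,
                       std::size_t table_size) {
    // table_size must be a power of two and hold at least one group.
    assert(table_size >= kNibbleSlots && (table_size & (table_size - 1)) == 0);
    std::uint32_t h = (ctx_hash ^ (nibble_tag * 0x9E3779B1u)) * 0x85EBCA6Bu;
    h ^= h >> 15;
    // Clearing the low 4 bits aligns the group to 16 slots.
    return (h & (table_size - 1)) & ~(kNibbleSlots - 1);
}

// `partial` is the bits of the current nibble seen so far with a leading 1
// (1..15). XOR only touches the low 4 bits, so the slot stays in the group.
std::size_t slot_index(std::size_t base, unsigned partial) {
    assert(partial >= 1 && partial < kNibbleSlots);
    return base ^ partial;
}

// Cache line number of a slot, assuming one byte per slot and an aligned table.
std::size_t cache_line_of(std::size_t index) { return index / kCacheLine; }
```

With this layout a context costs one hash and one potential cache miss per nibble, so two per byte, and the four bit-level lookups inside a nibble need no further hashing.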

    EDIT:
    Also, I was just thinking about floating point CM. With the new dot product (dpps) instruction that comes with SSE4, might it be a feasible option? What do you guys think?
    Last edited by Mat Chartier; 7th June 2013 at 20:37.

  25. #24
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    i think it's a great idea, but don't stop there. ideally, archives should be decompressible only on an i7-4770R under a full moon

  26. #25
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Agreed Bulat, we need to use these new instruction sets so that people with old CPUs finally upgrade.

    Although, I could check CPUID and have different code paths for older machines to make sure that the code runs. The main thing that I'm worried about is having consistent floating point behaviour on all machines.
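A minimal sketch of the CPUID idea follows, assuming GCC or Clang on x86, where `__builtin_cpu_supports` is available; other compilers fall through to the portable path. The names `dot_scalar`, `dot_unrolled`, and `select_dot` are hypothetical, and `dot_unrolled` is only a stand-in for a real SIMD routine. Both paths use integer arithmetic, so they return bit-identical results; a floating point version could differ between paths because the summation order changes, which is the consistency concern raised above.

```cpp
#include <cstddef>
#include <cstdint>

// Portable reference path.
std::int32_t dot_scalar(const std::int16_t* a, const std::int16_t* b, std::size_t n) {
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

// Stand-in for a SIMD path: four independent accumulators, same integer result.
std::int32_t dot_unrolled(const std::int16_t* a, const std::int16_t* b, std::size_t n) {
    std::int32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i] * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    for (; i < n; ++i) s0 += a[i] * b[i];  // leftover elements
    return s0 + s1 + s2 + s3;
}

// Runtime feature check; returns false when the compiler or CPU is not x86.
bool cpu_has_sse41() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return false;
#endif
}

using DotFn = std::int32_t (*)(const std::int16_t*, const std::int16_t*, std::size_t);

// Pick the code path once at startup and call through the pointer afterwards.
DotFn select_dot() { return cpu_has_sse41() ? dot_unrolled : dot_scalar; }
```

Selecting the function pointer once keeps the feature test out of the inner loop, and the scalar path doubles as a reference for checking that every faster path decodes the same data.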

  27. #26
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I guess you mean for the mixer. In zpaq I use SSE2 for the dot product using 20 bit weights and 12 bit predictions: drop 8 bits of the weight, multiply (PMADDWD), accumulate. But SSE2 turned out to be slower than scalar code to update the weights. It would have been faster to use 16 bit weights but in my experiments I lost too much compression. You could probably do it with probabilistic weight updates.
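The prediction step described above (drop 8 bits of each weight, multiply with PMADDWD, accumulate) can be sketched with SSE2 intrinsics as follows. This is an illustration under stated assumptions, not the ZPAQ source: weights are taken to be 32-bit integers within a 20-bit signed range, inputs are 16-bit values of about 12 bits, and the function names and the `MIXER_HAVE_SSE2` macro are hypothetical. `_mm_madd_epi16` is the intrinsic form of PMADDWD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define MIXER_HAVE_SSE2 1
#endif

// Scalar reference. The right shift of a negative weight is assumed to be
// arithmetic, which holds on mainstream compilers.
std::int32_t dot_scalar(const std::int16_t* x, const std::int32_t* w, std::size_t n) {
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i] * (w[i] >> 8);
    return sum;
}

// SSE2 path: 8 inputs per iteration. Falls back to the scalar loop when SSE2
// is not enabled at compile time.
std::int32_t dot_sse2(const std::int16_t* x, const std::int32_t* w, std::size_t n) {
    assert(n % 8 == 0);
#ifdef MIXER_HAVE_SSE2
    __m128i acc = _mm_setzero_si128();
    for (std::size_t i = 0; i < n; i += 8) {
        const __m128i xv = _mm_loadu_si128(reinterpret_cast<const __m128i*>(x + i));
        // Drop the low 8 bits of each 32-bit weight.
        const __m128i w0 = _mm_srai_epi32(
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(w + i)), 8);
        const __m128i w1 = _mm_srai_epi32(
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(w + i + 4)), 8);
        // Narrow to eight 16-bit weights, keeping element order.
        const __m128i wv = _mm_packs_epi32(w0, w1);
        // PMADDWD: multiply 16-bit pairs and add adjacent products into 32 bits.
        acc = _mm_add_epi32(acc, _mm_madd_epi16(xv, wv));
    }
    alignas(16) std::int32_t lanes[4];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#else
    return dot_scalar(x, w, n);
#endif
}
```

Because a 20-bit weight shifted right by 8 fits in 12 bits, the saturating pack never clips, and the SSE2 result matches the scalar one exactly.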

  28. #27
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Thanks for the answer Matt! I'm surprised that SSE2 wasn't faster than scalar code. I'll probably just stick to 32 bit integer weights for now.

  29. #28
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    SSE2 is faster for dot product of vectors of 16 bit signed elements, like in mixer prediction. It wasn't faster for updating 20 bit weights and bounding the values, even after I figured out how to do it in parallel.
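For comparison, a scalar weight update with bounding might look like the sketch below. The update rule, the rounding constant, the 16-bit shift, and the name `update_weights` are assumptions chosen for illustration, not the formula from the ZPAQ specification; only the 20-bit signed weight range comes from the post above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// 20-bit signed weight range.
constexpr std::int32_t kWeightMax = (1 << 19) - 1;
constexpr std::int32_t kWeightMin = -(1 << 19);

// Move each weight in the direction that reduces the error, scaled by its
// input, then clamp to the 20-bit range. `err` is assumed to already include
// the learning rate. The shift of a negative product is assumed arithmetic.
void update_weights(std::int32_t* w, const std::int16_t* x, std::size_t n,
                    std::int32_t err) {
    for (std::size_t i = 0; i < n; ++i) {
        // 64-bit product avoids overflow for large errors.
        const std::int64_t prod = static_cast<std::int64_t>(x[i]) * err;
        const std::int32_t delta = static_cast<std::int32_t>((prod + 0x8000) >> 16);
        w[i] = std::min(kWeightMax, std::max(kWeightMin, w[i] + delta));
    }
}
```

The clamp is the part that is awkward to vectorize with SSE2, since that instruction set has no packed 32-bit min/max; it has to be emulated with compares and masks.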

  30. #29
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Originally posted by Mat Chartier:
    Agreed Bulat, we need to use these new instruction sets so that people with old CPUs finally upgrade.
    Screw your users, so you have a better justification for playing with new toys, huh?

  31. #30
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    776
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Added mcm to last zpaq benchmark test, very good for single thread.
