Thread: New CM compressor in development

  1. #31
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,474
    Thanks
    26
    Thanked 121 Times in 95 Posts
    Floating-point instructions are specified with differing accuracy guarantees, but for many operations the result should be exact to the last bit. I think that's the case for at least addition (i.e. addition should behave identically on every processor, provided that it's done with the same precision). Here's a quote from Intel's paper:
    2.2.14 IEEE 754 Compliance
    The six SSE4.1 instructions that perform floating-point arithmetic are:
    • DPPS
    • DPPD
    • ROUNDPS
    • ROUNDPD
    • ROUNDSS
    • ROUNDSD
    Dot Product operations are not specified in IEEE-754. When neither FTZ nor DAZ are
    enabled, the dot product instructions resemble sequences of IEEE-754 multiplies and
    adds (with rounding at each stage), except that the treatment of input NaN's is
    implementation specific (there will be at least one NaN in the output). The input
    select fields (bits imm8[4:7]) force input elements to +0.0f prior to the first multiply
    and will suppress input exceptions that would otherwise have been generated.
    I'm not 100% sure, but that could mean that the DPPS result can be reproduced exactly without SSE4.1, using ordinary multiplies and adds. The only caveat I can think of is compiler optimization. I would strongly recommend using volatile asm statements or maybe intrinsics to have full control over the generated code.
    But then you would have to avoid instructions that aren't exactly specified by the IEEE 754 standard, which could negate the performance advantages or make the code too complex to maintain.

    Here's a quote from an nVidia paper:
    2.2 Operations and Accuracy
    The IEEE 754 standard requires support for a handful of operations. These
    include the arithmetic operations add, subtract, multiply, divide, square root,
    fused-multiply-add, remainder, conversion operations, scaling, sign operations,
    and comparisons. The results of these operations are guaranteed to be the same
    for all implementations of the standard, for a given format and rounding mode.
    Overall, having reproducible results with floating point is tricky, but not impossible, I think.
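
    As a rough illustration of the "multiplies and adds with rounding at each stage" idea from the Intel quote, here is a minimal scalar sketch in C++. The function name dot4_rounded is made up for this example, and the pairwise summation order (p0 + p1) + (p2 + p3) is an assumption about how the hardware combines the products, so check it against the instruction reference before relying on it. NaN handling, FTZ/DAZ and the imm8 input mask are not modelled. The volatile temporaries are only there to discourage the compiler from fusing a multiply and an add into one FMA or from keeping intermediates at higher precision.

```cpp
// Scalar stand-in for a 4-wide single-precision dot product.
// Hypothetical helper, not taken from mcm or from Intel's documentation.
// Assumption: the four products are summed pairwise, (p0 + p1) + (p2 + p3).
inline float dot4_rounded(const float a[4], const float b[4]) {
    // Each product is stored to a volatile float, so it is rounded to
    // single precision before it is used again.
    volatile float p0 = a[0] * b[0];
    volatile float p1 = a[1] * b[1];
    volatile float p2 = a[2] * b[2];
    volatile float p3 = a[3] * b[3];

    // Each partial sum is rounded to single precision as well.
    volatile float lo = p0 + p1;
    volatile float hi = p2 + p3;

    volatile float sum = lo + hi;
    return sum;
}
```

    Because every stage rounds, the result can differ from the mathematically exact dot product: with a = {1e8, 1, -1e8, 1} and b all ones, this returns 0 rather than 2, since 1e8 + 1 is not representable in single precision. That is fine for reproducibility as long as the encoder and the decoder both round the same way.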

  2. #32
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    New version: mcm v0.2

    Changes:
    * Simple detection algorithm that automatically switches between binary and text modes, good for TAR / ISO files.
    * Improved match model.
    * Word model now handles UTF8.
    * Added binary mode which uses a few sparse contexts.
    * Fewer cache misses (no speedup, due to the slower match model).
    * mingw version for those with Windows XP.
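
    For readers wondering what a simple text/binary switch might look like, here is a minimal sketch in C++. It is not mcm's detector: the names BlockKind and classify_block, the per-block interface and the 1% threshold are all assumptions made for illustration. The idea is to count bytes that rarely occur in text (control characters other than tab, LF and CR) and to treat the block as binary when they are common. Bytes with the high bit set are counted as text so that UTF-8 input is not misclassified.

```cpp
#include <cstddef>

enum class BlockKind { Text, Binary };

// Hypothetical heuristic, not the detector used by mcm.
// A block is called Text when at most 1% of its bytes are control
// characters other than tab, line feed and carriage return.
inline BlockKind classify_block(const unsigned char* data, std::size_t n) {
    if (n == 0) return BlockKind::Binary;  // arbitrary default for empty input

    std::size_t suspicious = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned char c = data[i];
        const bool allowed_ws = (c == '\t' || c == '\n' || c == '\r');
        // Control characters and DEL are rare in text; bytes >= 0x80 are
        // left alone because they appear in UTF-8 sequences.
        if ((c < 0x20 && !allowed_ws) || c == 0x7F) ++suspicious;
    }
    // Integer form of "suspicious / n <= 0.01".
    return (suspicious * 100 <= n) ? BlockKind::Text : BlockKind::Binary;
}
```

    A compressor could call something like this once per block of a TAR or ISO file and pick the text or binary model set accordingly. A real detector would probably also need some hysteresis so that it does not flip modes on every block.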

    TODO:
    * Speed improvements: better prefetching, multithreading
    * E8E9
    * More modes
    * SSE
    * Better UI / display

  3. The Following 5 Users Say Thank You to Mat Chartier For This Useful Post:

    GOZARCK (11th June 2013),Jan Ondrus (12th June 2013),Matt Mahoney (12th June 2013),samsat1024 (13th June 2013),Stephan Busch (12th June 2013)

  4. #33
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    The GCC build is dynamically linked and requires libgcc_s_dw2-1.dll and libstdc++-6.dll.
    Since GCC v4.5.2 was used, anyone who wants to run the GCC build can find the files here:
    http://sourceforge.net/projects/ming....lzma/download
    http://sourceforge.net/projects/ming....lzma/download

  5. The Following User Says Thank You to Skymmer For This Useful Post:

    Mat Chartier (11th June 2013)

  6. #34
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Updated LTCB. mcm moves up 1 spot. http://mattmahoney.net/dc/text.html#1656

  7. The Following User Says Thank You to Matt Mahoney For This Useful Post:

    Mat Chartier (13th June 2013)

  8. #35
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Version 0.3
    I'll probably put mcm on github "soon".

    Minor changes:
    States no longer have fixed probability (hehe, a bit slower).
    Improved sparse models.
    Improved match model.
    UI / display improvements.

    mingw44.tar:
    mcm02 -9: 36,280kB
    mcm03 -9: 35,865kB
    ccmx 7: 35,915kB

  9. The Following 4 Users Say Thank You to Mat Chartier For This Useful Post:

    Jan Ondrus (18th June 2013),Matt Mahoney (27th June 2013),Nania Francesco (25th June 2013),Stephan Busch (18th June 2013)

  10. #36
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    I think you should create an archiver with a solid mode; you could get excellent results by adopting a system similar to paq9, which was my inspiration for ZCM! If you want, we can trade some pieces of code. If you're interested, send me a message on the forum!

  11. #37
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    mcm now beats paq9a. http://mattmahoney.net/dc/text.html#1644

    Also, mcm_gcc.exe can't run because it is looking for some DLL files. (I have a different build of g++ installed). You can fix this problem by compiling with -static.
    Last edited by Matt Mahoney; 28th June 2013 at 03:13.

  12. #38
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Thanks Matt, I'll use that flag for future versions. As for open sourcing, I have been very busy with a new job, but I am working on cleaning up the source code so that I can hopefully put mcm up on github soon (maybe this weekend). Nania, you should make ZCM open source, it is better than trading source code by PM. Does ZCM use a chain of mixers like paq9a? I thought that would be slow, but maybe I'm wrong.

    I think I need to add some kind of dictionary preprocessor next. DRT confuses the binary detector, but if I force text mode on I'm getting 151,134KB on enwik9 with a ~50% speedup.

  13. The Following 3 Users Say Thank You to Mat Chartier For This Useful Post:

    Jan Ondrus (28th June 2013),Nania Francesco (5th July 2013),ZGish (28th June 2013)
