
Thread: MCM + LZP

  1. #1
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts

    MCM + LZP

Hey all, I took a year-and-a-half break from compression stuff, but I recently started messing around with CM + LZP. Attached is the 64-bit MCM.

The current version has two modes: -fast and -high.
-fast is 6-context CM + LZP
-high is 8-context CM + LZP (the default)
The LZP is very basic; it just recycles the same bit processing as the normal CM. It uses 256 + the expected char for the order-0 xor context instead of the normal 8-bit order-0 context.
If the match model has an expected char and the bit is a non-match, that information is fed through SSE (secondary symbol estimation) to help reduce the cost of non-matching chars. When the match model has no match, it just codes a normal byte without an LZP bit.
Speed should be similar to mcm03, or a bit slower (for now). I wonder how ZCM achieves so much speed?
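A rough sketch of that coding path (Coder, MatchModel, and encodeBit are illustrative stand-ins, not mcm's real classes; the real coder is more involved):

Code:
// Sketch of the CM+LZP byte coder described above.
#include <cstdint>

struct Coder      { void encodeBit(int bit, uint32_t o0ctx); };  // assumed CM entry point
struct MatchModel { bool hasMatch() const; uint8_t expectedChar() const; };

// Code the 8 bits of a literal under the usual 1-prefixed
// partial-byte order-0 context.
static void encodeLiteral(Coder& cm, uint8_t c) {
  uint32_t partial = 1;
  for (int i = 7; i >= 0; --i) {
    int bit = (c >> i) & 1;
    cm.encodeBit(bit, partial);
    partial = partial * 2 + bit;
  }
}

void encodeByte(Coder& cm, MatchModel& mm, uint8_t c) {
  if (!mm.hasMatch()) {          // no prediction: normal byte, no LZP bit
    encodeLiteral(cm, c);
    return;
  }
  uint8_t expected = mm.expectedChar();
  // One LZP match bit, coded under 256 + expected char instead of the
  // normal 8-bit order-0 context.
  cm.encodeBit(c == expected, 256u + expected);
  if (c != expected) {
    // Miss: code the literal; the expected char stays visible to the
    // SSE stage, which uses it to cut the cost of non-matching chars.
    encodeLiteral(cm, c);
  }
}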

Changes:
Random tweaking; added SSE to help the LZP.
Added -10 and -11 for more memory, using around 2.8 GB / 5.5 GB (64-bit versions only).
Huffman disabled.
E8E9 filter (always on for now; works OK for most files unless there are lots of false positives).
64-bit support.
Linux support (./make.bat on Ubuntu should compile).

    Sample results with -9:
SFC: 10,382,334
    enwik9: 160,663,427
    enwik8: 19,505,098
    silesia: 41,025,259

    Source code is updated on my github:
    https://github.com/mathieuchartier/mcm

I think there are probably more ways to improve the LZP by having fewer models / more specialized models for the LZP predictor bit.
Attached Files
    • File Type: 7z mcm.7z (521.2 KB, 504 views)

  2. The Following 6 Users Say Thank You to Mat Chartier For This Useful Post:

    Bulat Ziganshin (5th February 2015),comp1 (6th February 2015),Matt Mahoney (5th February 2015),Nania Francesco (5th February 2015),Stephan Busch (5th February 2015),surfersat (6th February 2015)

  3. #2
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Updated silesia and LTCB benchmarks.
    http://mattmahoney.net/dc/silesia.html
http://mattmahoney.net/dc/text.html (went from 20th to 12th place)

  4. The Following User Says Thank You to Matt Mahoney For This Useful Post:

    Mat Chartier (5th February 2015)

  5. #3
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    177
    Thanked 51 Times in 37 Posts
Quote Originally Posted by Mat Chartier View Post
Hey all, I took a year-and-a-half break from compression stuff, but I recently started messing around with CM + LZP. [...]
    Great to see this and to see you back. Don't take any more breaks from compression!

  6. #4
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
Almost ready to release 0.81, which has a nice improvement for exe/binary and a slight improvement on text. One thing I keep wondering about is how to improve speed without hurting compression much.

So far, my ideas for increasing speed are:
Faster LZP match-bit prediction (hard to do without hurting compression).
Re-enable Huffman preprocessing. However, the Huffman benefits are mitigated by LZP, and static Huffman doesn't work very well for large files.
Multi-threading: usually hurts compression, and a lot of work to implement.
Re-enable prefetching; probably only helps a little bit (see the sketch after this list).
Use a dictionary transform instead of LZP for text (hard to implement).

Anyone else have ideas?
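On the prefetching point, the usual trick is to issue a software prefetch for the model-table bucket as soon as the next context hash is known, so the cache miss overlaps with coding the current byte. A minimal sketch, with an assumed bucket layout:

Code:
// Sketch of model-table prefetching; Bucket and the table layout are
// illustrative, not mcm's actual structures.
#include <xmmintrin.h>   // _mm_prefetch
#include <cstdint>

struct Bucket { uint16_t counters[8]; };   // hypothetical 16-byte bucket

inline void prefetchBucket(const Bucket* table, uint64_t hash, uint64_t mask) {
  _mm_prefetch(reinterpret_cast<const char*>(&table[hash & mask]), _MM_HINT_T0);
}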

  7. #5
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    177
    Thanked 51 Times in 37 Posts
Quote Originally Posted by Mat Chartier View Post
Almost ready to release 0.81, which has a nice improvement for exe/binary and a slight improvement on text. [...] Anyone else have ideas?
    Mat,

I don't have any specific ideas for you, but I would say that the speed of 0.80 is already very good: close to CCMX, and in many cases MCM at -9 produces a better ratio than CCMX at level 7.

Perhaps your goals for MCM are more focused on improving speed, but improving compression while keeping the current speed seems like a good idea to me.

Perhaps another option would be to incorporate other filters? CCMX excels over MCM in certain cases because of its specific data filters. You mentioned EXE and text filters; what about a BMP or WAV filter? Then again, perhaps improving the EXE and text filters beyond the 0.81 improvements is all that is necessary.

Sorry I cannot be of more help; I am not a scientist or a programmer.

    Good luck and keep working on it!

  8. #6
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    460
    Thanked 175 Times in 85 Posts
If you implement wildcards and recursion, you could let MCM switch off models that give no gain on certain files
while switching on those that would help, e.g. an audio model for every WAV/AIFF, a PGM model for PGM, a PPM model for PPM images.
In those cases the text model can be switched off, while on text we would need the text model but not the audio, PGM, or PPM models.
In my test cases, Huffman did not bring improvements.
If I remember right, programs like UHARC and CCMX tried the delta filter every few KB and checked whether it gives a gain;
if yes, it is enabled, otherwise not (roughly as sketched below).
I think a dictionary transform would be much better for text than LZP. I like this idea.
I would also vote for a light Huffman recompressor.
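A rough sketch of that per-block delta-filter trial (the order-0 entropy estimate and all names are illustrative, not any program's actual code):

Code:
// Enable delta filtering for a block only if the deltas look cheaper
// than the raw bytes under a crude order-0 entropy estimate.
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

static double order0Bits(const uint8_t* p, size_t n) {
  size_t cnt[256] = {0};
  for (size_t i = 0; i < n; ++i) cnt[p[i]]++;
  double bits = 0;
  for (int s = 0; s < 256; ++s)
    if (cnt[s]) bits -= cnt[s] * std::log2(double(cnt[s]) / n);
  return bits;
}

bool deltaHelps(const uint8_t* block, size_t n) {
  std::vector<uint8_t> delta(n);
  uint8_t prev = 0;
  for (size_t i = 0; i < n; ++i) { delta[i] = block[i] - prev; prev = block[i]; }
  return order0Bits(delta.data(), n) < order0Bits(block, n);
}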

  9. #7
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    mcm v0.81:
    Significant improvements to binary models, slight improvements to text model. Added prefetch to match model and a few other things for speed tuning.
    Added -turbo (3 contexts), -fast (4 contexts), and -max (10 contexts)


    Reference mcm v0.8:
    SFC -9:
    -high 10,382,334 in 38.198s
    Silesia -9:
    -high 41,025,259 in 143.9s


    New mcm v0.81:
    SFC -9:
    -turbo (3 ctx): 10,733,617 in 21.629s
    -fast (4 ctx): 10,555,083 in 26.109s
    -mid (6 ctx): 10,313,325 in 31.408s
    -high (8 ctx): 10,251,772 in 35.866s
    -max (10 ctx): 10,208,213 in 38.358s
    Silesia -9:
    -fast: 42,200,550 in 91.2s
    -mid: 40,632,970 in 116.956s
    -high: 40,330,795 in 136.879s
    -max: 40,151,163 in 148.767s


    Turns out testing whether a bit is set is fast enough to do a few times per byte.
I added an easy way to programmatically enable / disable models. An example of the current models for the binary profile:
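// 'inputs' comes from the level: -turbo=3, -fast=4, -mid=6, -high=8, -max=10.
// Each test enables one more model, so higher levels enable more of this list: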
    if (inputs > idx++) enableModel(kModelOrder1);
    if (inputs > idx++) enableModel(kModelOrder2);
    if (inputs > idx++) enableModel(kModelSparse34);
    if (inputs > idx++) enableModel(kModelOrder4);
    if (inputs > idx++) enableModel(kModelSparse23);
    if (inputs > idx++) enableModel(kModelMask);
    if (inputs > idx++) enableModel(kModelSparse4);
    if (inputs > idx++) enableModel(kModelOrder3);
    if (inputs > idx++) enableModel(kModelSparse2);
    if (inputs > idx++) enableModel(kModelSparse3);


I made a branch for v0.81 so that the source stays visible while I do incremental development.
    https://github.com/mathieuchartier/mcm/tree/v0.81
Attached Files
    • File Type: 7z mcm.7z (451.6 KB, 664 views)

  10. The Following User Says Thank You to Mat Chartier For This Useful Post:

    Stephan Busch (10th February 2015)

  11. #8
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    177
    Thanked 51 Times in 37 Posts
    Mat,

    Great improvement in compression!

    HFCB's vm.dll
    Code:
    SIZE       TIME        PROGRAM
    =========================================
    958143613  1990.67500  mcm 0.80 -9 -high
    881842673  2126.16200  mcm 0.81 -9 -max
    Again, excellent improvement!

  12. The Following User Says Thank You to comp1 For This Useful Post:

    Mat Chartier (11th February 2015)

  13. #9
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
MCM v0.82, minor update:
Updates:
Tuned the x86 filter to produce fewer bad transforms on non-x86 data.
Tuned some constants through hill climbing / brute force (sketched below); improved match model handling.
More prefetching + other optimizations (~5-10% faster vs 0.81).
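A minimal sketch of hill-climbing one constant against compressed size; costFn is a hypothetical stand-in for "compress the corpus with this constant and return the output size":

Code:
#include <cstddef>
#include <functional>

// Move in whichever direction shrinks the output; halve the step when
// neither neighbor improves, and stop when the step reaches zero.
int hillClimb(const std::function<size_t(int)>& costFn, int value, int step) {
  size_t best = costFn(value);
  while (step != 0) {
    size_t up = costFn(value + step);
    if (up < best) { best = up; value += step; continue; }
    size_t down = costFn(value - step);
    if (down < best) { best = down; value -= step; continue; }
    step /= 2;
  }
  return value;
}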

    TODO:
Rewrite the text / binary detector + add more filters (a lot of work).

    Enwik8 -9 -high:
    v0.80: 19505098 in 90.556s
    v0.81: 19381820 in 85.029s
    v0.82: 19318179 in 77.556s

    SFC -9:
    -mid: 10,307,681
    -high: 10,246,239
    -max: 10,207,930

    Silesia -9:
    -mid: 40,541,614
    -high: 40,251,527
    -max: 40,091,648

    Source at: https://github.com/mathieuchartier/mcm/tree/v0.82
Attached Files

  14. The Following 3 Users Say Thank You to Mat Chartier For This Useful Post:

    Cyan (16th February 2015),Nania Francesco (16th February 2015),Stephan Busch (16th February 2015)

  15. #10
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    177
    Thanked 51 Times in 37 Posts
Nice improvement, Mat!

    Code:
    SIZE       TIME        PROGRAM
    =========================================
    881842673  2126.16200  mcm 0.81 -9 -max
    879245585  1846.88800  mcm 0.82 -9 -max
Also, I did some other quick tests, and MCM 0.82 beats ZCM and CCMX every time for compression ratio. You are on your way to beating NanoZip: once data-type detection/filters are implemented, I believe you will beat NanoZip for ratio (you have already beaten it for speed).

    Don't give up on this project.
    Last edited by comp1; 18th February 2015 at 06:54.

  16. #11
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
I am proud to announce mcm v0.83, which has nice improvements on text.


New features:
Two-pass approach: analyze, then compress the streams one at a time.
Dynamic dictionary preprocessor using 0 to 600,000+ words depending on the file.
Special .wav compressor (not a great ratio, but fast at >10 MB/s).
-store option for using just the filters and dictionary preprocessor (undocumented).

New LZP idea for text: only use LZP if the match length is greater than ~13, otherwise use the match model. According to my tests, this works better than never using LZP. A sketch of the rule is below.
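A minimal sketch of that decision (the names and the profile flag are illustrative; the ~13 threshold is from the post):

Code:
#include <cstddef>

const size_t kMinLzpLenText = 13;       // hypothetical constant name

// Text profile: take the LZP path only for long matches; short ones
// are left to the ordinary match model. Binary keeps LZP as before.
bool useLzp(size_t matchLen, bool textProfile) {
  if (!textProfile) return true;
  return matchLen > kMinLzpLenText;
}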


    Sample results:
    calgary.tar -x9: 667,097 in 3s
    silesia -m9: 40,340,745 in 102s
    silesia -x9: 39,772,294 in 135s
    enwik8 -m9: 18,623,547 in 51s
enwik8 -x9: 18,376,397 in 69s


    Future ideas:
Is there a way to do per-sample CM for audio samples? Logistic mixing makes no sense there, but something like paq6 linear mixing might work. I want a way to have negative weights, though; a sketch of one option is below.
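Something like this could work: an LMS-style linear mixer that allows signed weights. This is only a sketch, not paq6's actual count-based update, and all names are illustrative:

Code:
// Linear mixing of N model probabilities with signed weights,
// updated by gradient (LMS) on each coded bit.
struct LinearMixer {
  static const int N = 4;
  double w[N] = {0.25, 0.25, 0.25, 0.25};   // weights may go negative

  double mix(const double (&p)[N]) const {  // p[i]: model's P(bit = 1)
    double s = 0;
    for (int i = 0; i < N; ++i) s += w[i] * p[i];
    return s < 0.001 ? 0.001 : (s > 0.999 ? 0.999 : s);  // keep valid
  }

  void update(const double (&p)[N], double pr, int bit, double rate = 0.002) {
    double err = bit - pr;                  // error in [-1, 1]
    for (int i = 0; i < N; ++i) w[i] += rate * err * p[i];
  }
};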


    Caution:
This version has a lot of changes and could be buggy. Dictionary-building memory usage could be HUGE if the file is very large. Also, the 32-bit version isn't well tested and might be broken on >4 GB files.

    Source at:
    https://github.com/mathieuchartier/mcm/tree/v0.83
Attached Files

  17. The Following 5 Users Say Thank You to Mat Chartier For This Useful Post:

    Bulat Ziganshin (10th April 2015),comp1 (5th April 2015),Jan Ondrus (13th May 2015),Nania Francesco (5th April 2015),Stephan Busch (5th April 2015)

  18. #12
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    177
    Thanked 51 Times in 37 Posts
    Good improvement!

    Code:
    SIZE       TIME        PROGRAM
    =========================================
    879245585  1846.88800  mcm 0.82 -9 -max
    874226934  1336.988    mcm 0.83 -x9
    Keep working on it!

  19. The Following User Says Thank You to comp1 For This Useful Post:

    Mat Chartier (5th April 2015)

  20. #13
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
I am glad you listened to the suggestions and added LZP to the program. I definitely did not expect such great strides. Only one thing I want you to consider: I've seen that mcm still uses floating-point numbers, and I don't think that makes it compatible across all CPUs (you may want to test archives across various CPUs). However, the program's strength is definitely its text processor, which is a strong point (I hope we soon see Matt's LTCB results).
    Last edited by Nania Francesco; 10th April 2015 at 17:27.

  21. The Following 2 Users Say Thank You to Nania Francesco For This Useful Post:

    Bulat Ziganshin (10th April 2015),Mat Chartier (10th April 2015)

  22. #14
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
Hi Nania, I think the latest versions "should" not use floats any more; where did you see float numbers in the coding process? I moved to using stems for squash / stretch a while back. One thing you may have seen is that the dictionary preprocessing is non-deterministic, i.e. 32-bit mcm builds a slightly different dictionary than 64-bit. However, this is benign since I encode the dictionary at the start of the file.
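For reference, the integer-only squash/stretch pair in the usual paq/lpaq style looks roughly like this (a sketch of that well-known technique, not mcm's exact code):

Code:
// squash: logistic in 12-bit fixed point via interpolated table
// (lpaq-style); no floating point anywhere, so results are
// deterministic across CPUs.
int squash(int d) {                      // d in [-2047, 2047]
  static const int t[33] = {1,2,3,6,10,16,27,45,73,120,194,310,488,747,
    1101,1546,2047,2549,2994,3348,3607,3785,3901,3975,4024,4050,4068,
    4079,4085,4089,4092,4093,4094};
  if (d >  2047) return 4095;
  if (d < -2047) return 0;
  int w = d & 127;
  d = (d >> 7) + 16;
  return (t[d] * (128 - w) + t[d + 1] * w + 64) >> 7;
}

// stretch: inverse of squash, built once at startup by scanning, so
// the hot path is a single table lookup.
struct StretchTable {
  short t[4096];
  StretchTable() {
    int pi = 0;
    for (int x = -2047; x <= 2047; ++x) {
      int p = squash(x);
      for (; pi <= p; ++pi) t[pi] = short(x);
    }
    for (; pi < 4096; ++pi) t[pi] = 2047;
  }
  int operator()(int p) const { return t[p]; }   // p in [0, 4095]
};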

I still feel like MCM's LZP is not as strong as ZCM's; for example, FP.log is slower and less compressed with MCM (use -filters=none) than with ZCM 0.92. When you encode LZP, do you use a single bit for LZP / no LZP, or do you encode the match length as a whole number?

  23. #15
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
I'm glad you removed the float calculations (evidently I was wrong). I'm at work on version 0.93 of ZCM, and it will be difficult to compete with the current MCM.

  24. #16
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    O__O


Strange that it's gone unnoticed, but MCM 0.83 -x11 enters the Top 10 of LTCB.

Code:

                                SIZE          COMP TIME   COMP MEM   DECOMP TIME   DECOMP MEM
mcm 0.83 x64 -x11 enwik8        18 233 295    44.500s     5839 MB    40.203s       5707 MB
mcm 0.83 x64 -x11 enwik9        144 854 575   394.421s    5961 MB    281.250s      5723 MB

Decompressor is 79 574 bytes (sources from GitHub, size-optimized by removing comments, KZIP /b512)
So, with an overall size of 144 934 149 bytes, MCM 0.83 -x11 takes 7th place, beating such monsters as NanoZip, xwrt, fp8 and WinRK.
Considering the speed at which MCM reaches this result, it's simply astonishing.

    Decompression is correct, tested on i7-2700K, WinXP x64 SP2. Size optimized sources attached.
Attached Files

  25. The Following 5 Users Say Thank You to Skymmer For This Useful Post:

    comp1 (22nd April 2015),Cyan (23rd April 2015),kaitz (22nd April 2015),Mat Chartier (22nd April 2015),willvarfar (23rd April 2015)

  26. #17
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
Thanks for measuring this. I think there is still a fair bit of room for improvement by better assigning codes for the dictionary. Keep in mind that beating those compressors while using 3-4x the RAM isn't exactly apples to apples.

Right now I have basic multi-file support done for Windows, and I am also working on simple block-based dedupe. These will be part of 0.84, but I don't think there will be any significant changes to the dictionary or the CM core.

    EDIT:
    Thx Matt, I'll try to make sure that is fixed in the next version.
    Last edited by Mat Chartier; 22nd April 2015 at 23:35.

  27. #18
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    It's about time I updated LTCB. @Skymmer, thanks for testing. http://mattmahoney.net/dc/text.html#1449

    mcm32.exe -x9 failed on silesia/dickens. Decompression stalled, and when I killed it the output was the right size but all zero bytes. The rest of silesia was OK.

  28. #19
    Member
    Join Date
    Jul 2013
    Location
    Stanford, California
    Posts
    24
    Thanks
    7
    Thanked 2 Times in 2 Posts
    Hi Mat,

Thanks for mcm. Can you help with compilation on Linux and FreeBSD (and, lowest priority, OS X)? It seems that make.sh refers to a missing File.cpp, and clang++ gave some errors on OS X. Also, if anyone else has gotten it to compile and has a static executable, that would be great.

  29. #20
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I don't see File.cpp in make.sh. I just tried again in 64 bit Ubuntu/g++ 4.8.2 and make.sh works both on the github sources and Skymmer's optimized sources. make.sh looks like this:

    g++ -DNDEBUG -O3 -fomit-frame-pointer -msse2 -std=c++0x -D_FILE_OFFSET_BITS=64 -o mcm CM.cpp Archive.cpp Huffman.cpp MCM.cpp Memory.cpp Util.cpp Compressor.cpp LZ.cpp -lpthread

  30. #21
    Member
    Join Date
    Jul 2013
    Location
    United States
    Posts
    194
    Thanks
    44
    Thanked 140 Times in 69 Posts
    Quote Originally Posted by Matt Mahoney View Post
    I don't see File.cpp in make.sh. I just tried again in 64 bit Ubuntu/g++ 4.8.2 and make.sh works both on the github sources and Skymmer's optimized sources. make.sh looks like this:

    g++ -DNDEBUG -O3 -fomit-frame-pointer -msse2 -std=c++0x -D_FILE_OFFSET_BITS=64 -o mcm CM.cpp Archive.cpp Huffman.cpp MCM.cpp Memory.cpp Util.cpp Compressor.cpp LZ.cpp -lpthread
    Seems like you're using an old checkout. Current git looks like this:

    Code:
    g++ -DNDEBUG -O3 -fomit-frame-pointer -msse2 -std=c++0x  -D_FILE_OFFSET_BITS=64 -o mcm CM.cpp Archive.cpp Huffman.cpp MCM.cpp  Memory.cpp Util.cpp Compressor.cpp LZ.cpp File.cpp -lpthread
    The issue has already been reported (just over two weeks ago).

  31. #22
    Member
    Join Date
    Sep 2011
    Location
    uk
    Posts
    237
    Thanks
    187
    Thanked 16 Times in 11 Posts
Did I miss something? What is mcm32 in your .7z distribution? I assumed a 32-bit version, but it doesn't work on XP ('not a win32 program'). John

  32. #23
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
I'm adding these flags when compiling with MSVC:

Code:
-link -MACHINE:x86 -SUBSYSTEM:CONSOLE,5.01 -LARGEADDRESSAWARE
-link -MACHINE:x64 -SUBSYSTEM:CONSOLE,5.02

On an already-built executable you can do:

Code:
editbin.exe a.exe /SUBSYSTEM:CONSOLE,5.01 /LARGEADDRESSAWARE

  33. #24
    Member
    Join Date
    Sep 2011
    Location
    uk
    Posts
    237
    Thanks
    187
    Thanked 16 Times in 11 Posts
Thanks - so can the distribution please be updated, Matt, so that mcm32 runs on XP? TIA, John

  34. #25
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Yes, I was using an old checkout.

  35. #26
    Member
    Join Date
    Jul 2015
    Location
    Algeria
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    MCM v0.83 Missing File.cpp

Quote Originally Posted by Matt Mahoney View Post
I don't see File.cpp in make.sh. I just tried again in 64 bit Ubuntu/g++ 4.8.2 and make.sh works both on the github sources and Skymmer's optimized sources. [...]

When I compile under Linux Mint Cinnamon 17.1, I get this:
-------------------------------------------------------------------------------------------
blackhole@galaxy ~/mcm $ ./make.sh
g++: error: File.cpp: No such file or directory

This is the content of "make.sh":
----------------------------------------------
g++ -DNDEBUG -O3 -fomit-frame-pointer -msse2 -std=c++0x -D_FILE_OFFSET_BITS=64 -o mcm CM.cpp Archive.cpp Huffman.cpp MCM.cpp Memory.cpp Util.cpp Compressor.cpp LZ.cpp File.cpp -lpthread

    When i remove "File.cpp" from "make.sh" i got this:
    --------------------------------------------------------------------------------
    /tmp/ccDDyso4.o: In function `Archive::writeBlocks()':
    Archive.cpp.text+0x289c): undefined reference to `FileList::write(Stream*)'
    /tmp/ccDDyso4.o: In function `Archive::readBlocks()':
    Archive.cpp.text+0x2caa): undefined reference to `FileList::read(Stream*)'
    /tmp/ccDDyso4.o: In function `Archive::list()':
    Archive.cpp.text+0x2d6f): undefined reference to `FileInfo::attrToStr(unsigned short)'
    /tmp/ccDDyso4.o: In function `Archive::compress(std::vector<FileInfo, std::allocator<FileInfo> > const&)':
    Archive.cpp.text+0x3aa9): undefined reference to `FileList::addDirectoryRec(std::string const&, std::string const*)'
    /tmp/ccE9QpCX.o: In function `Options:arse(int, char**)':
    MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0xf: undefined reference to `FileInfo::FileInfo(std::string const&, std::string const*)'
    MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0x3e6): undefined reference to `FileInfo::FileInfo(std::string const&, std::string const*)'
    MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0x476): undefined reference to `FileInfo::FileInfo(std::string const&, std::string const*)'
    MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0x600): undefined reference to `FileInfo::FileInfo(std::string const&, std::string const*)'
    MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0x689): undefined reference to `FileInfo::FileInfo(std::string const&, std::string const*)'
    /tmp/ccE9QpCX.o:MCM.cpp.text._ZN7Options5parseEiPPc[_ZN7Options5parseEiPPc]+0x9a9): more undefined references to `FileInfo::FileInfo(std::string const&, std::string const*)' follow
    /tmp/ccE9QpCX.o: In function `main':
    MCM.cpp.text.startup+0x9a): undefined reference to `FileInfo::FileInfo()'
    collect2: error: ld returned 1 exit status

Note:
-------
Take into consideration that it's not an option to use the old "File.hpp", since that one contains a different version of the FileInfo and FileList classes.

I hope this issue gets looked at, and thanks anyway for the beautiful code.

  36. #27
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
MCM v0.83 skbuild1
- You can now set the memory level up to 13, allowing slightly better compression (be careful, it is very memory hungry!)
- Console output has been redesigned for better viewing; it also reveals some undocumented options for more precise control over filtering and LZP
- The 32-bit compile is static GCC instead of MSVC, so the MSVC runtimes are no longer required
Attached Files

  37. The Following 2 Users Say Thank You to Skymmer For This Useful Post:

    avitar (13th July 2015),Mike (13th July 2015)

  38. #28
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    231
    Thanks
    90
    Thanked 47 Times in 31 Posts
    This is remarkable. MCM took a 279 KB CreateJS.js file down to 35 KB. The best I could do with 7Z was 46 KB.

NOTE: It wouldn't compress JS files directly; I guess that's not supported. I had to convert them to .txt first.

It would help if you could post the command-line syntax and arguments (I had some trouble there), and the supported source file formats. I understand if you don't want to bother with direct JS support, since on the web we're forced to deliver JS as gzip anyway, but it would be helpful to know all the supported file types.

I really wish we could get some movement on web compression formats beyond gzip. I'm not a gzip hater; it's unbelievably efficient and light, and it's amazing that something written in the '80s and '90s is still so hard to beat on CPU and memory use at a given compression ratio and decompression speed. But surely we can develop a format at this point that matches gzip's CPU use, memory efficiency, and decompression speed while picking up a big win on compression ratio. MCM seems to have potential there.

  39. #29
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    231
    Thanks
    90
    Thanked 47 Times in 31 Posts
Matt Mahoney, thanks for your benchmarks page. Question: is the mem column really supposed to be in megabytes, as indicated in your legend? If so, it means lots of programs are using many GB of RAM, like cmix v6 using 30+ GB (first row). Most PCs don't have the amount of memory this table suggests these programs are using.

Relatedly, could you post the specs and configuration of the machine you're using for the benchmarks? Sorry if it's already posted and I missed it.

  40. #30
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
Yes, mem is in MB, and yes, cmix requires 30 GB of RAM. The note column shows the spec of the test machine. Most of my recent tests are on a Linux machine (note 48), but some of my older tests are on other machines. I realize that speed depends on the CPU and compression ratio depends on memory, but I don't want to put any limits on hardware. If your compressor is top ranked but only runs on a quantum computer, then we learned something.

