Page 2 of 2
Results 31 to 54 of 54

Thread: MCM + LZP

  1. #31
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    Originally posted by SolidComp:
    This is remarkable. MCM took a 279 KB CreateJS.js file down to 35 KB. The best I could do with 7Z was 46 KB.

    NOTE: It wouldn't compress JS files directly. I guess it's not supported. I had to convert them to .txt first.

    It would help if you could post the command line syntax and arguments. I had some trouble there. And the supported source file formats. I understand if you don't want to bother with direct JS support, since on the web we're forced to deliver JS as gzip anyway, but it would be helpful to know all the supported file types.

    I really wish we could get some movement on web compression formats beyond gzip. I'm not a gzip hater – it's unbelievably efficient and light, amazing that something written in the 80s and 90s could still be so hard to beat on CPU and memory use at a given comp ratio and decomp speed. But certainly we must be able to develop a format at this point that can match gzip CPU, memory efficiency, and decomp speed, while picking up a big win on compression ratio. MCM seems to have potential there.
    A bit off-topic, but you can deliver JS without gzip: packing it into a PNG is the popular choice among size coders. For general web work I doubt you are desperate enough to cram as much as possible into a small size ;p Still, here is some stuff to nerd out on if you like JS:

    Most commonly used tool: http://www.pouet.net/prod.php?which=59298

    It is used for compos like http://js1k.com/

    It is also used at any number of other scene events. You can get some pretty good ratios, as seen in demos like https://www.youtube.com/watch?v=mQFjReMd2us, which fits in under 4096 bytes. Extra info and source are here: http://www.ylilammi.com/webgl/highway4k/

  2. #32
    Member
    Join Date
    Jul 2013
    Location
    Stanford, California
    Posts
    24
    Thanks
    7
    Thanked 2 Times in 2 Posts
    Hi,

    I'm wondering what source changes you made to allow for the greater memory usage. I'm interested in a similar build but for Linux and a static binary.

    Originally posted by Skymmer:
    MCM v0.83 skbuild1
    - you can now set the memory level up to 13, allowing slightly better compression (be careful, it is very memory hungry!)
    - console output has been redesigned for better viewing, and it also reveals some undocumented options for more precise control over filtering and LZP
    - the 32-bit build is now a static GCC compile instead of MSVC, so the MSVC runtimes are no longer required

  3. #33
    Member
    Join Date
    Jun 2013
    Location
    Canada
    Posts
    36
    Thanks
    24
    Thanked 47 Times in 14 Posts
    Hi, the main reason that my MCM version only allows up to -11 is 32/64-bit compatibility. One of the things I try to support is compressing with 32-bit MCM and decompressing with 64-bit MCM (or vice versa). To do this I use 32-bit hashes, and it happens that -11 uses all 32 bits for lookups. Without bigger hashes, adding more RAM will provide barely any benefit.

    I think the main changes which would be required would involve changing hashes to be uint64_t instead of uint32_t, though this would break 32 and 64 bit compatibility unless the hashing functions were changed.
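    To illustrate the point about hash width, here is a minimal sketch, assuming a power-of-two table indexed by masking the hash. The names `ReachableSlots` and `SlotIndex` are made up for this example and are not taken from the MCM source; the actual table layout in MCM may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not MCM source code.
// Number of distinct slots a hash with `hash_bits` bits can reach in a
// power-of-two table that has 2^table_bits slots (assumes the result fits
// in 64 bits, i.e. the smaller of the two arguments is below 64).
constexpr std::uint64_t ReachableSlots(unsigned hash_bits, unsigned table_bits) {
  return std::uint64_t{1} << (hash_bits < table_bits ? hash_bits : table_bits);
}

// Typical masked lookup: only the low `table_bits` bits of the hash are used.
constexpr std::uint64_t SlotIndex(std::uint64_t hash, unsigned table_bits) {
  return hash & ((std::uint64_t{1} << table_bits) - 1);
}

// A 32-bit hash covers a 2^32-slot table completely...
static_assert(ReachableSlots(32, 32) == (std::uint64_t{1} << 32), "");
// ...but still reaches only 2^32 slots of a table four times as large.
static_assert(ReachableSlots(32, 34) == (std::uint64_t{1} << 32), "");
// A 64-bit hash can reach every slot of the larger table.
static_assert(ReachableSlots(64, 34) == (std::uint64_t{1} << 34), "");
// The largest uint32_t hash value never maps past slot 2^32 - 1.
static_assert(SlotIndex(UINT32_MAX, 34) == UINT32_MAX, "");
```

    Under these assumptions, growing the table beyond what the hash can address just leaves the extra slots unused, which is why a wider hash type would have to come first.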

  4. #34
    Member
    Join Date
    Sep 2017
    Location
    Czech
    Posts
    4
    Thanks
    2
    Thanked 3 Times in 1 Post
    Hi,
    there must be a bug in MCM (I tried the 32-bit v0.83 skbuild1 from Skymmer).
    Compression always stops at the same place and does not continue (image attached).
    No error is reported.
    I tried it on two different computers with the same result:
    1) AMD Athlon X2 (I do not remember the exact model), 2 cores, 2 GB RAM, Windows 7 32-bit.
    2) Intel Core i7-7700HQ, 4 cores, 16 GB RAM, Windows 10 64-bit.
    [Attached image: stopping.jpg]
    The 64-bit version works normally.
    [Attached image: 64b-final.jpg]
    Any idea?

  5. #35
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    171
    Thanks
    28
    Thanked 73 Times in 43 Posts
    Not sure if anyone else has noticed but MCM v0.84 was released quite some time ago: https://github.com/mathieuchartier/m...rchive.hpp#L83

  6. #36
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    474
    Thanked 175 Times in 85 Posts
    Is there a compile available?

  7. #37
    Member
    Join Date
    May 2017
    Location
    Sealand
    Posts
    15
    Thanks
    7
    Thanked 2 Times in 2 Posts
    Can someone provide an MCM 0.84 build for testing?

  8. #38
    Member
    Join Date
    Apr 2009
    Location
    here
    Posts
    204
    Thanks
    172
    Thanked 110 Times in 66 Posts
    There are errors when compiling under MinGW; the code needs some fixing.

  9. #39
    Member
    Join Date
    May 2017
    Location
    Sealand
    Posts
    15
    Thanks
    7
    Thanked 2 Times in 2 Posts
    Here's the source from a pull request made 14 days ago, which may fix the errors. Compile with the -O3 switch for optimization.
    Attached Files

  10. Thanks:

    vnx (20th September 2018)

  11. #40
    Member
    Join Date
    Apr 2009
    Location
    here
    Posts
    204
    Thanks
    172
    Thanked 110 Times in 66 Posts
    I can compile this in MinGW with no errors or warnings, but it's useless: the decompressed files don't match the originals.

  12. Thanks:

    Chirantan (3rd November 2017)

  13. #41
    Member
    Join Date
    Feb 2018
    Location
    Italy
    Posts
    4
    Thanks
    2
    Thanked 4 Times in 2 Posts
    Hi,
    I've compiled it in both Cygwin and MinGW and it works; when decompressing, it restores the original file under its original filename.
    Version 0.84 compresses better than 0.83; the Cygwin enwik9 compression log follows.
    I've attached a MinGW x64 executable so you can test.
    Regards

    Code:
    $ time ./mcm.exe -x11 enwik9 enwik9.mcm_x11
    ======================================================================
    mcm compressor v0.84, by Mathieu Chartier (c)2016 Google Inc.
    Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
    Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
    ======================================================================
    Compressing to enwik9.mcm_x11 mode=max mem=11
    Enumerating files
    Enumerating took 0s
    Analyzing 1 files
    Analyzed 0 size=950.25MB 77460KB/s
    Analyzing took 12.608s
    
    
    (flist=10+blocks=113)=123 -> 76
    
    Compressing text block size=999,999,988
    Constructed dict words=40+9088+244332=279885 save=85862703+239149000+56372761=381384464 extra=0 time=0.437s
    Dictionary words=253460 size=2.20032MB
    
    976483KB -> 135304KB 1667KB/s ratio: 0.13856
    Compressed 999,999,988 -> 138,569,565 in 586.406s
    
    
    Escape 259585 word 4202283 first 32303383
    Compressing binary block size=12
    
    
    Compressed 12 -> 26 in 0.469s
    
    
    Word counter used 35632956 hash size 33554431
    Done compressing 1,000,000,000 -> 138,569,681 in 601s bpc=1.11
    
    real    10m1.978s
    user    9m56.140s
    sys     0m5.062s
    
    $ mv enwik9 enwik9.orig
    
    $ ls -l enwik9.*
    -rw-r--r-- 1 Vnx Vnx  138569681 Sep 20 19:25 enwik9.mcm_x11
    -rwxr-xr-x 1 Vnx Vnx 1000000000 Sep 20 19:15 enwik9.orig
    
    $ ./mcm.exe d enwik9.mcm_x11 test
    ======================================================================
    mcm compressor v0.84, by Mathieu Chartier (c)2016 Google Inc.
    Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
    Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
    ======================================================================
    Decompresing archive enwik9.mcm_x11
    Metadata size=123
    
    Decompressing text stream size=999,999,988
    
    Dictionary words=253460 size=2.20032MB
    976381KB <- 135300KB 1738KB/s ratio: 0.13857
    Decompressed 999,999,988 <- 138,569,565 in 561.61s
    
    
    Escape 0 word 0 first 0
    Decompressing binary stream size=12
    
    
    Decompressed 12 <- 26 in 0.453s
    
    
    $ ls -l enwik9* test
    ls: cannot access 'test': No such file or directory
    -rw-r--r-- 1 Vanex Vanex 1000000000 Sep 20 19:36 enwik9
    -rw-r--r-- 1 Vanex Vanex  138569681 Sep 20 19:25 enwik9.mcm_x11
    -rwxr-xr-x 1 Vanex Vanex 1000000000 Sep 20 19:15 enwik9.orig
    
    $ sha256sum.exe enwik9 enwik9.orig
    159b85351e5f76e60cbe32e04c677847a9ecba3adc79addab6f4c6c7aa3744bc *enwik9
    159b85351e5f76e60cbe32e04c677847a9ecba3adc79addab6f4c6c7aa3744bc *enwik9.orig
    Attached Files
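    The `bpc=1.11` figure in the log above can be reproduced from the two sizes on the "Done compressing" line. The following is a small sketch of that arithmetic; the function names are my own and are not part of MCM.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Bits per character: compressed size in bits divided by original size in bytes.
inline double BitsPerChar(std::uint64_t compressed_bytes, std::uint64_t original_bytes) {
  return 8.0 * static_cast<double>(compressed_bytes) /
         static_cast<double>(original_bytes);
}

// Compression ratio as a fraction: compressed size divided by original size.
inline double Ratio(std::uint64_t compressed_bytes, std::uint64_t original_bytes) {
  return static_cast<double>(compressed_bytes) /
         static_cast<double>(original_bytes);
}
```

    With the totals from the log, `BitsPerChar(138569681, 1000000000)` is 8 * 138,569,681 / 1,000,000,000, or about 1.1086, which rounds to the reported 1.11. For the text block alone, `Ratio(138569565, 999999988)` is about 0.1386, in line with the ratios on the progress lines.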

  14. Thanks (3):

    CompressMaster (19th November 2019),Darek (20th September 2018),Samantha (20th September 2018)

  15. #42
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,151
    Thanks
    703
    Thanked 455 Times in 352 Posts
    Looks impressive!
    The score is only about 2 MB worse on plain enwik9 than the paq8px series without precompression, and the compression time is about 300x-400x shorter!

  16. #43
    Member
    Join Date
    Mar 2016
    Location
    USA
    Posts
    56
    Thanks
    7
    Thanked 23 Times in 15 Posts
    Here's a buildable version of master -- however, on the HTTP log files I use it on, this version sometimes fails to compress, and the ratio is actually worse on average than 0.83.

  17. #44
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,151
    Thanks
    703
    Thanked 455 Times in 352 Posts
    Here are some scores on my test set for MCM v0.84.
    The best method for textual files is -x3 on my files and the best overall method is -x6; however, the best scores for particular files are spread across all -x and -h methods. It is very fast, with high performance.
    [Attached image: mcm_v84.jpg]

  18. Thanks:

    Samantha (20th September 2018)

  19. #45
    Member
    Join Date
    Sep 2017
    Location
    Czech
    Posts
    4
    Thanks
    2
    Thanked 3 Times in 1 Post
    @vnx
    This binary crashes on my Windows 7 x64.
    It behaves the same way on another computer.
    Do I need to install some libraries?

  20. #46
    Member
    Join Date
    Feb 2018
    Location
    Italy
    Posts
    4
    Thanks
    2
    Thanked 4 Times in 2 Posts
    Hi,
    I've compiled this one with fewer switches:
    g++ -DNDEBUG -O3 -fomit-frame-pointer -std=gnu++11 -D_FILE_OFFSET_BITS=64 -o mcm_generic Archive.cpp Huffman.cpp MCM.cpp Memory.cpp Util.cpp Compressor.cpp File.cpp LZ.cpp Tests.cpp -lpthread -static -march=x86-64 -mtune=generic -s

    Alternatively, try compiling on your own machine; just replace line 73 of GD.hpp with:
    return f_.template Cost<Acc>(*this, inputs, actual);
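    For anyone wondering why the extra `template` keyword matters: presumably `f_` has a type that depends on a template parameter, in which case standard C++ requires the disambiguator before a member template call. A minimal sketch with made-up stand-in types (these are not the classes from GD.hpp):

```cpp
#include <cassert>

// Hypothetical stand-ins; these are not the classes from MCM's GD.hpp.
struct SquaredError {
  // Member template, analogous in shape to a Cost<Acc>(...) member.
  template <typename Acc>
  Acc Cost(Acc predicted, Acc actual) const {
    const Acc diff = predicted - actual;
    return diff * diff;
  }
};

template <typename F>
struct Layer {
  F f_;

  template <typename Acc>
  Acc Evaluate(Acc predicted, Acc actual) const {
    // f_ has the dependent type F, so without `template` the parser treats
    // the `<` after Cost as a less-than operator and GCC rejects the call.
    return f_.template Cost<Acc>(predicted, actual);
  }
};
```

    Here `Layer<SquaredError>{}.Evaluate<double>(3.0, 1.0)` compiles with GCC and returns 4.0; dropping the keyword inside `Evaluate` turns the line into a compile error.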

    Regards
    Attached Files

  21. #47
    Member
    Join Date
    Sep 2017
    Location
    Czech
    Posts
    4
    Thanks
    2
    Thanked 3 Times in 1 Post
    Thank you, it works well.
    It seems that the -x12 parameter is no longer supported.

  22. #48
    Member CompressMaster's Avatar
    Join Date
    Jun 2018
    Location
    Lovinobana, Slovakia
    Posts
    189
    Thanks
    52
    Thanked 13 Times in 13 Posts
    I tested MCM 0.84 on enwik8 with maximum compression in mind. The results are quite impressive!
    But can it do better on enwik8, i.e. reach 14 MB or so?
    Also, in my tests on enwik8 the "x8" parameter requires around 843 MB of memory. Could you lower that (to approx. 100 MB), regardless of decompression time, while retaining the same compressed size?

  23. #49
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,911
    Thanks
    291
    Thanked 1,272 Times in 719 Posts
    > But, can it be better on enwik8? i.e. achieve 14MB or so.

    You can probably improve the enwik results with external preprocessing (DRT, cmix -s, xwrt, nncp preprocess).
    But not by that much, or it would win the Hutter Prize.

  24. #50
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    151
    Thanks
    50
    Thanked 42 Times in 31 Posts
    I do not think that DRT or xwrt will help. There is already a dict transform in mcm.

  25. #51
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,911
    Thanks
    291
    Thanked 1,272 Times in 719 Posts
    XWRT, sure, and DRT itself is not much either, but DRT's dictionary is specially tuned to produce the best results on enwik.
    The main idea of LIPT (aka WRT/DRT) is that words in the text are replaced with shorter, word-like codes
    (e.g. base64 of the dictionary index; a word code can't contain spaces or non-text symbols, but can otherwise look random).
    But the word order in DRT's dictionary is optimized to produce similar prefixes _and_ suffixes for similar words,
    so matches are longer with it, and I don't expect any runtime dictionary preprocessor to beat it on enwik.
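    A toy sketch of the word-replacement idea described above, not DRT's or MCM's real format. It assumes lowercase, space-separated input so that the uppercase codes cannot collide with ordinary words; real transforms need an escape mechanism and preserve whitespace. `IndexToCode` and `EncodeWords` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of LIPT/WRT-style word replacement.
// Turn a dictionary index into a short code made only of letters 'A'..'Z',
// so the code still looks like a word to the compressor's text model.
std::string IndexToCode(std::size_t index) {
  std::string code;
  do {
    code.push_back(static_cast<char>('A' + index % 26));
    index /= 26;
  } while (index != 0);
  return code;
}

// Replace every space-separated word found in the dictionary by its code;
// words that are not in the dictionary are copied unchanged.
std::string EncodeWords(const std::string& text,
                        const std::vector<std::string>& dictionary) {
  std::unordered_map<std::string, std::string> codes;
  for (std::size_t i = 0; i < dictionary.size(); ++i) {
    codes[dictionary[i]] = IndexToCode(i);
  }

  std::istringstream in(text);
  std::string word;
  std::string out;
  while (in >> word) {
    if (!out.empty()) out.push_back(' ');
    const auto it = codes.find(word);
    out += (it != codes.end()) ? it->second : word;
  }
  return out;
}
```

    Because the code is derived from the dictionary index, the order of the dictionary decides which words end up with similar codes. That ordering is the part DRT tunes for enwik, and this sketch makes no attempt to reproduce it.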

  26. #52
    Member
    Join Date
    Nov 2019
    Location
    Europe
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Bug in MCM x32 0.84

    MCM version 0.84 x32 generates files larger than the original and incompatible with the x64 version.
    Please fix it!

  27. #53
    Member
    Join Date
    May 2020
    Location
    Hebei Langfang
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I've used this wonderful program for a year and wondered who made it, and then I found this thread. Thank you. However, it's hard to use for ordinary people like me, because it only runs from cmd and compresses one file per operation. It would be great if somebody could design a GUI or build a graphical front end for it.

  28. #54
    Member
    Join Date
    May 2017
    Location
    Hungary
    Posts
    10
    Thanks
    49
    Thanked 3 Times in 3 Posts
    Maybe you can integrate it into Bulat's FreeArc...
