Results 421 to 450 of 450

Thread: cmix

  1. #421
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,135
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    100000000/126675.67 = 789 bytes/s. But that's still 10x faster than NNCP :)
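The figure above is simply the input size divided by the elapsed time. A minimal sketch of that arithmetic, using the enwik8 size and the timing from this post (the function name is made up for illustration):

```python
def throughput_bytes_per_second(input_bytes, seconds):
    """Average compression speed in input bytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return input_bytes / seconds


# enwik8 size and elapsed time, as quoted in the post above.
speed = throughput_bytes_per_second(100_000_000, 126675.67)
print(f"{speed:.0f} bytes/s")
```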

  2. #422
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    Quote Originally Posted by encode View Post
    Results on my PC:
    CPU: Intel Core i5-9600K @ 4.8 GHz
    MoBo: MSI Z390 GAMING EDGE
    RAM: HyperX Predator 32 GB (2x16) 3200 MHz CL16
    Storage: Samsung 970 EVO Plus 250 GB M.2 NVMe SSD
    Code:
    C:\cmix>timer cmix -c enwik8 enwik8.cmix18
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    100000000 bytes -> 15384933 bytes in 126675.67 s.
    cross entropy: 1.231
    
    Kernel Time  =    12.140 = 00:00:12.140 =   0%
    User Time    = 126660.828 = 35:11:00.828 =  99%
    Process Time = 126672.968 = 35:11:12.968 =  99%
    Global Time  = 126675.921 = 35:11:15.921 = 100%
    ENWIK9 results will take some time...
    I recommend enabling dictionary preprocessing to improve compression rate and compression time: timer cmix -c dictionary/english.dic enwik8 enwik8.cmix18
    Compiling cmix yourself will also produce a faster executable. For cmix v18 on my computer, enwik8 compresses to 14838332 in 57508 seconds.

  3. #423
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,272
    Thanks
    802
    Thanked 545 Times in 415 Posts
    >Compiling cmix yourself will also produce a faster executable. For cmix v18 on my computer, enwik8 compresses to 14838332 in 57508 seconds.
    @Byron -> how can I compile cmix myself? Is there a fast and easy compiler I can use for it? I'm a lamer when it comes to this kind of topic.

  4. #424
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,135
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    For windows you can get gcc/mingw: https://sourceforge.net/projects/min...onal%20Builds/
    Then compile it like this: https://encode.su/threads/1925-cmix?...ll=1#post62052
    There would be more compiler options for best speed, though.

  5. Thanks:

    Darek (4th January 2020)

  6. #425
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    Quote Originally Posted by Darek View Post
    >Compiling cmix yourself will also produce a faster executable. For cmix v18 on my computer, enwik8 compresses to 14838332 in 57508 seconds.
    @Byron -> how can I compile cmix myself? Is there a fast and easy compiler I can use for it? I'm a lamer when it comes to this kind of topic.
    In Linux, just run "make". In Windows it is a bit more difficult. You can try either MinGW (http://nuwen.net/mingw.html) or Cygwin (https://www.cygwin.com) and then run "make".

  7. Thanks:

    Darek (4th January 2020)

  8. #426
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,135
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    For "make" it's better to get msys: http://www.msys2.org/ - it's the only system with a working package manager.
    Mingw distributions frequently don't have any make at all, or it doesn't work.
    And cygwin requires manually selecting some packages in GUI setup - you have to know what you need when installing it.

    So I think the best method is mingw + use a list file for g++ @list, like I suggested before. Make is not needed to build cmix.
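A rough sketch of the list-file approach, not the exact recipe from the linked post. GCC accepts `@file` arguments and reads further command-line arguments from that file. The `src` directory name, the list file name and the helper names below are assumptions for illustration; the `-Ofast -march=native` flags are the ones mentioned elsewhere in this thread.

```python
from pathlib import Path


def write_response_file(src_root, list_path):
    """Write every .cpp file under src_root to list_path, one path per line."""
    # Forward slashes are used because, as far as I know, backslashes can be
    # treated as escape characters inside a GCC response file.
    sources = sorted(p.as_posix() for p in Path(src_root).rglob("*.cpp"))
    Path(list_path).write_text("\n".join(sources) + "\n", encoding="utf-8")
    return sources


def build_command(list_path, output="cmix"):
    """Return the g++ command line that consumes the list file."""
    return ["g++", "-Ofast", "-march=native",
            "@" + Path(list_path).as_posix(), "-o", output]
```

Typical use would be `write_response_file("src", "list")` followed by `subprocess.run(build_command("list"), check=True)`. Paths containing spaces would need quoting in the list file, which this sketch does not handle.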

  9. #427
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    It's an executable from your webpage. Now I'm running the ENWIK9 compression - I think it will take more than 10 days. Will test the dictionary right after!

  10. #428
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    178
    Thanks
    61
    Thanked 51 Times in 40 Posts
    Mingw-w64 (x86_64-8.1.0-posix-seh-rt_v6-rev0) works well for me. Just replace clang++ with g++ in the makefile and run "mingw32-make".

  11. Thanks:

    Shelwien (3rd January 2020)

  12. #429
    Member
    Join Date
    Apr 2018
    Location
    Indonesia
    Posts
    85
    Thanks
    21
    Thanked 5 Times in 5 Posts
    Quote Originally Posted by byronknoll View Post
    I recommend enabling dictionary preprocessing to improve compression rate and compression time: timer cmix -c dictionary/english.dic enwik8 enwik8.cmix18
    Compiling cmix yourself will also produce a faster executable. For cmix v18 on my computer, enwik8 compresses to 14838332 in 57508 seconds.

    The difference between encode's result and yours is 546601 bytes. Is it possible to close that gap by using dictionary preprocessing? I guess that even with the dictionary, the results will still differ.

  13. #430
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    The ENWIK9 compression has been running for 6 days and is 34% complete. Has anyone ever rechecked the results?


  14. #431
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by encode View Post
    The ENWIK9 compression has been running for 6 days and is 34% complete. Has anyone ever rechecked the results?

    Has anyone rechecked the NNCP result too?

  15. #432
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    Results with a slightly higher overclock:
    i5-9600K @ 4.9 GHz all cores. 5.0 GHz and above requires much higher voltage which is not safe for 24/7 running.
    Code:
    C:\cmix>timer cmix -c dictionary/english.dic enwik8 enwik8.cmix18
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    100000000 bytes -> 14847703 bytes in 76734.24 s.
    cross entropy: 1.188
    
    Kernel Time  =    15.625 = 00:00:15.625 =   0%
    User Time    = 76713.015 = 21:18:33.015 =  99%
    Process Time = 76728.640 = 21:18:48.640 =  99%
    Global Time  = 76736.093 = 21:18:56.093 = 100%
    And I don't believe your compression/decompression timings. Or it is an extremely poor EXE compile at your website - please recompile and update the page!

  16. #433
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,135
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    cmix dynamically allocates lots of small memory blocks (mostly for the LSTM model),
    which is known to have a large effect on speed and memory consumption depending on the OS and compiler.
    Also there could be a difference in vector extensions used.

  17. #434
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    Quote Originally Posted by encode View Post
    And I don't believe your compression/decompression timings. Or it is an extremely poor EXE compile at your website - please recompile and update the page!
    The executable on the cmix page was compiled without "-march=native" to improve compatibility between computers. If you compile your own executable with "-Ofast -march=native" it will be significantly faster. I think the main reason is auto-vectorization and SIMD. The cmix benchmarks I ran were not with the public executable, but with one compiled with "-march=native".

  18. #435
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    Results with a compile by Shelwien (Thank you!)
    Code:
    C:\cmix>timer cmix -c dictionary/english.dic enwik8 enwik8.cmix18
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    100000000 bytes -> 14846066 bytes in 63749.76 s.
    cross entropy: 1.188
    
    Kernel Time  =    13.484 = 00:00:13.484 =   0%
    User Time    = 63731.453 = 17:42:11.453 =  99%
    Process Time = 63744.937 = 17:42:24.937 =  99%
    Global Time  = 63751.796 = 17:42:31.796 = 100%
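The "cross entropy" line in these logs looks like the compressed size expressed in bits per input byte. That is an inference from the numbers posted in this thread, not a statement about how cmix computes it internally. A small check against the three enwik8 runs quoted so far:

```python
def bits_per_byte(compressed_bytes, original_bytes):
    """Compressed size in bits per byte of original input."""
    return 8 * compressed_bytes / original_bytes


ENWIK8_BYTES = 100_000_000

# (label, compressed size in bytes, "cross entropy" value from the log)
runs = [
    ("post #422 quote, no dictionary", 15_384_933, 1.231),
    ("post #432, dictionary", 14_847_703, 1.188),
    ("post #435, dictionary", 14_846_066, 1.188),
]

for label, size, reported in runs:
    bpb = bits_per_byte(size, ENWIK8_BYTES)
    print(f"{label}: {bpb:.3f} bits/byte (log says {reported})")
```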

  19. #436
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)

  20. #437
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    142
    Thanks
    28
    Thanked 95 Times in 32 Posts
    Quote Originally Posted by Sportman View Post
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)
    8.76 days ... think about that (for 8.76 days).

  21. #438
    Member
    Join Date
    Nov 2019
    Location
    Malaysia
    Posts
    3
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Sportman View Post
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)

    For enwik10 it needs 87.6 days (~3 months)
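The estimate above is a linear extrapolation from the enwik9 timing. A minimal sketch, assuming enwik10 is ten times the size of enwik9 and that throughput stays constant, neither of which is verified in this thread:

```python
SECONDS_PER_DAY = 86_400


def seconds_to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


# enwik9 timing reported by Sportman.
enwik9_days = seconds_to_days(756_510.756)

# Assumed size ratio enwik10 / enwik9; constant speed is also an assumption.
SIZE_RATIO = 10
enwik10_days = enwik9_days * SIZE_RATIO

print(f"enwik9: {enwik9_days:.2f} days")
print(f"enwik10 (linear estimate): {enwik10_days:.1f} days")
```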

  22. #439
    Member
    Join Date
    Apr 2018
    Location
    Indonesia
    Posts
    85
    Thanks
    21
    Thanked 5 Times in 5 Posts
    Quote Originally Posted by lzhuff View Post
    Quote Originally Posted by Sportman View Post
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)

    For enwik10 it needs 87.6 days (~3 months)

  23. #440
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by lzhuff View Post
    Quote Originally Posted by Sportman View Post
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)

    For enwik10 it needs 87.6 days (~3 months)

    Which parts of cmix can be reduced so that it runs in under 8 GB of RAM? Would reducing the paq8 and paq8hp levels from 11 to 6 let it run in under 8 GB of RAM?

  24. #441
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by Sportman View Post
    cmix (v18 ) -c english.dic enwik9:
    115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)

    @sportman could you test decompression with cmix? I tested cmix v17 with the paq8hp and paq8 levels reduced to 4, and the checksum values don't match.

  25. #442
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    @sportman If you want to test cmix's decompression function, use a small file; it does not take very long.

  26. #443
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    I have tested cmix17 on wrtpre.cpp, and the hash value after decompression does not match the original file. Why?
    This is the hash value of wrtpre.cpp: 3FE3BD3E77A2A34869EC12FD77491EF9D0192BFA
    I have attached the source code and the binary of cmix17, which I compiled using Dev-C++.
    Attached Files

  27. #444
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    I have tested cmix17 on wrtpre.cpp, and the hash value after decompression does not match the original file. Why?
    This is the hash value of wrtpre.cpp: 3FE3BD3E77A2A34869EC12FD77491EF9D0192BFA
    I have attached the source code and the binary of cmix17, which I compiled using Dev-C++.
    I might be able to help debug this. Did you make any modifications to cmix v17? Did you use a dictionary when compressing and decompressing? Can you post a copy of wrtpre.cpp so I can try reproducing the problem on my computer?

  28. #445
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by byronknoll View Post
    I might be able to help debug this. Did you make any modifications to cmix v17? Did you use a dictionary when compressing and decompressing? Can you post a copy of wrtpre.cpp so I can try reproducing the problem on my computer?
    I just set paq8 and paq8hp to level 6 in cmix17. I did not use a dictionary when compressing or decompressing. Here is wrtpre.cpp.
    Attached Files

  29. #446
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    It works for me. Here is a Colab where you can see it compress+decompress to the same md5: https://colab.research.google.com/dr...eX-ZSjMgF29Ci7
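For anyone who wants to repeat this kind of round-trip check locally, here is a minimal sketch that compares the digest of the original file with the digest of the decompressed output. The function names are made up for illustration. SHA-1 is the default only because the 40-hex-digit value posted earlier has the length of a SHA-1 digest; pass "md5" to match the Colab.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the uppercase hex digest of a file, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()


def round_trip_ok(original_path, restored_path, algorithm="sha1"):
    """True if the restored file has the same digest as the original."""
    return (file_digest(original_path, algorithm)
            == file_digest(restored_path, algorithm))
```

After compressing and decompressing with the same binary, `round_trip_ok("wrtpre.cpp", "wrtpre.out")` should return True; the output file name here is hypothetical.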

  30. #447
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by byronknoll View Post
    It works for me. Here is a Colab where you can see it compress+decompress to the same md5: https://colab.research.google.com/dr...eX-ZSjMgF29Ci7
    How do I use Colab?

  31. #448
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    How do I use Colab?

    Did you use my binary compiled with Dev-C++?

  32. #449
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    514
    Thanks
    63
    Thanked 96 Times in 75 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    Did you use my binary compiled with Dev-C++?

    My assumption is that you compiled the source code on Linux. The question is why the same source code, compiled with a different compiler, can produce a corrupted decompressed file.

  33. #450
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    252
    Thanks
    116
    Thanked 124 Times in 72 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    My assumption is that you compiled the source code on Linux. The question is why the same source code, compiled with a different compiler, can produce a corrupted decompressed file.
    I didn't use your binary. Here are some suggestions that might help:
    - change the compiler flag from -Ofast to -O3. The binary will be slower, but might fix the issue you are seeing.
    - upgrade your compiler to a more recent version.
    - change to a different compiler - I recommend clang.
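One plausible reason the -Ofast suggestion could matter (an assumption; the root cause is not confirmed in this thread): in GCC and Clang, -Ofast turns on -ffast-math, which allows the compiler to reorder floating-point operations. Floating-point addition is not associative, so a reordered sum can differ in the last bit, and a compressor whose decoder must reproduce the encoder's predictions exactly is sensitive to that kind of difference. The snippet below only demonstrates the non-associativity itself, not cmix behaviour:

```python
# Same three numbers, two groupings of the same sum.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c
right = a + (b + c)

# The two results differ in the last bit of the double.
print(left == right)
print(repr(left), repr(right))
```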
