Page 1 of 2
Results 1 to 30 of 42

Thread: bigm file compressor

  1. #1
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts

    bigm file compressor

    Could someone on this forum run it on enwik8 and enwik9, or on Darek's benchmark? It uses a neural network, so the running time can be very long.
    Thank you very much.
    Attached Files

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,065
    Thanks
    310
    Thanked 1,360 Times in 777 Posts

  3. Thanks:

    JamesB (19th November 2019)

  4. #3
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by Shelwien View Post
    No.

  5. #4
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,065
    Thanks
    310
    Thanked 1,360 Times in 777 Posts
    What "no"? I looked at the debuginfo in it, and it's got all the cmix classes
    Code:
     0001:00000500       Manager::Manager(void)
     0001:0000088E       Manager::UpdateHistory(void)
     0001:000008D0       Manager::UpdateWords(void)
     0001:00000A36       Manager::UpdateRecentBytes(void)
     0001:00000A68       Manager::Perceive(int)
     0001:00000B9C       Manager::AddContext(std::unique_ptr<Context,std::default_delete<Context>>)
     0001:00000C36       Manager::AddBitContext(std::unique_ptr<BitContext,std::default_delete<BitContext>>)
     0001:00000C90       Predictor::GetNumNeurons(void)
     0001:00000D46       Predictor::Predict(void)
     0001:00000FE4       Predictor::Perceive(int)
     0001:000011D0       Predictor::Add(Model *)
     0001:00001290       Predictor::AddDMC(void)
     0001:0000139C       Predictor::AddByteRun(void)
     0001:000019C8       Predictor::AddRunMap(void)
     0001:00002004       Predictor::AddNonstationary(void)
     0001:00002728       Predictor::AddMatch(void)
     0001:00002FA8       Predictor::AddDoubleIndirect(void)
     0001:0000386A       Predictor::AddDirect(void)
     0001:00003F1C       Predictor::AddSparse(void)
     0001:000051B2       Predictor::AddEnglish(void)
     0001:000065C4       Predictor::AddByteModel(ByteModel *)
     0001:00006686       Predictor::AddPPM(void)
     0001:000067A6       Predictor::Add(int,Mixer *)
     0001:000068CC       Predictor::AddPAQ8L(void)
     0001:00006A16       Predictor::AddPAQ8HP(void)
     0001:00006B62       Predictor::AddPAQ8H2(void)
     0001:00006CAC       Predictor::AddSSE(void)
     0001:000073BE       Predictor::AddMixers(void)
     0001:0000A3D4       Predictor::Predictor(void)
    https://github.com/byronknoll/cmix/b...-manager.h#L13
    https://github.com/byronknoll/cmix/b...redictor.h#L26
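
    For anyone who wants to repeat this kind of check, here is a minimal sketch in Python. It assumes the symbol listing has been saved as plain text in the `segment:offset name` layout shown above. The helper name `group_symbols` is made up for illustration and is not part of cmix or of any tool.

```python
import re
from collections import defaultdict

# Matches lines such as " 0001:00000500       Manager::Manager(void)"
SYMBOL_LINE = re.compile(r"^\s*([0-9A-Fa-f]{4}):([0-9A-Fa-f]{8})\s+(\S.*?)\s*$")


def group_symbols(listing):
    """Group 'Class::method(args)' symbols from a listing by class name."""
    classes = defaultdict(list)
    for line in listing.splitlines():
        match = SYMBOL_LINE.match(line)
        if not match:
            continue  # skip anything that is not a symbol line
        symbol = match.group(3)
        # Drop the argument list, then split class from method.
        head = symbol.split("(", 1)[0]
        if "::" not in head:
            continue  # free function, no class to group under
        cls, method = head.split("::", 1)
        classes[cls].append(method)
    return dict(classes)
```

    The resulting class and method names can then be compared by eye against the cmix headers linked above.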

  6. Thanks:

    schnaader (19th November 2019)

  7. #5
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by Shelwien View Post
    What "no"? I looked at the debuginfo in it, and it's got all the cmix classes
    Only the names are the same; the content is different. You can check the SHA and see whether it has the same value or not.
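
    A minimal sketch of such a check in Python, assuming SHA-256 (the post does not say which SHA variant is meant). The function names are made up for illustration. Note that differing digests only show that two files are not byte-for-byte identical; a rebuilt or renamed binary also gets a new digest.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True only if both files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```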

  8. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,583
    Thanks
    797
    Thanked 691 Times in 374 Posts
    lzturbo reincarnation

  9. #7
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    521
    Thanks
    196
    Thanked 186 Times in 127 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    Only the names are the same; the content is different. You can check the SHA and see whether it has the same value or not.
    It's possible it was derived from cmix but is still substantially different, but the evidence of all those names is that it needs to be open source given cmix is GPL.

    Can we therefore please get a copy of the source code, so people can judge how it compares and also compile it themselves without the risk of a virus?

  10. #8
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,065
    Thanks
    310
    Thanked 1,360 Times in 777 Posts
    There's no virus, so it can be tested and compared with cmix results.
    I don't think it's the most recent cmix version.

  11. #9
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,583
    Thanks
    797
    Thanked 691 Times in 374 Posts
    Scanning for known viruses can't detect all trojan horses, e.g. a program that deletes all files on disk.

  12. #10
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,065
    Thanks
    310
    Thanked 1,360 Times in 777 Posts
    I actually looked at the decompiler output to check that, and just found cmix instead.
    Of course I can't give a 100% guarantee, but there's nothing obvious, and I think that
    a person able to hide a trojan from me would at least be able to build cmix with -s after replacing the name.

  13. Thanks (3):

    Gotty (26th November 2019),introspec (26th November 2019),JamesB (26th November 2019)

  14. #11
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by Shelwien View Post
    I actually looked at the decompiler output to check that, and just found cmix instead.
    Of course I can't give a 100% guarantee, but there's nothing obvious, and I think that
    a person able to hide a trojan from me would at least be able to build cmix with -s after replacing the name.

    This is bigm_suryak v2:
    improved the word model with a new hash function.
    The attached file contains both source and binary.
    Darek or someone else, please test it and post the result. It uses up to 32 GB of memory.
    Attached Files

  15. #12
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    676
    Thanks
    402
    Thanked 449 Times in 234 Posts
    suryakandau@yahoo.co.id,
    Someone may help you out here (not me - I have only 8 gigs), but you are asking for significant resources.
    Please consider going shopping for some RAM so that you can try and verify your ideas. This is essential if you are seriously thinking about improving cmix. It will not go easily if you need to wait for someone for days or weeks.

    Most importantly, please respect the work of others, especially if you are planning to stand on their shoulders.

  16. #13
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,065
    Thanks
    310
    Thanked 1,360 Times in 777 Posts
    I actually posted this specifically for him: https://encode.su/threads/3242-googl...ession-testing

    Also, he really did some context tweaking in paq8pxd... presuming that this is the same thing, I'm kinda okay with renaming,
    since it's better than having a paq8pxd47_bwtXX branch. Just have to be clear about authorship.

  17. Thanks:

    Gotty (26th November 2019)

  18. #14
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    527
    Thanks
    218
    Thanked 376 Times in 197 Posts
    It has 3 new or modified files; I assume it is based on cmix version x from 2015.
    2 new paq8 (l, hp) models with reduced memory. Hash functions and 9 contexts in the word model.
    I think it will be slow. Too bad there isn't SIMD code in the mixers. It still has some SIMD in it, probably auto-generated by the compiler. I checked only paq8l.o.
    KZo


  19. Thanks:

    Gotty (26th November 2019)

  20. #15
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by Shelwien View Post
    I actually posted this specifically for him: https://encode.su/threads/3242-googl...ession-testing

    Also, he really did some context tweaking in paq8pxd... presuming that this is the same thing, I'm kinda okay with renaming,
    since it's better than having a paq8pxd47_bwtXX branch. Just have to be clear about authorship.
    Paq8pxd branched from paq8px69 and then started with a new thread.

  21. #16
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    bigm_suryak v3
    enwik8 17,858,121 bytes
    compression time 132776.36 s
    Attached Files
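
    To put these figures in perspective, here is a small sketch in Python. It assumes enwik8 is exactly 100,000,000 bytes (its standard size) and that the reported time is wall-clock seconds. The function names are made up for illustration.

```python
ENWIK8_SIZE = 100_000_000  # assumption: standard enwik8 size in bytes


def bits_per_byte(compressed_size, original_size=ENWIK8_SIZE):
    """Compressed bits spent per input byte."""
    return 8.0 * compressed_size / original_size


def throughput(original_size, seconds):
    """Input bytes processed per second."""
    return original_size / seconds


if __name__ == "__main__":
    size, secs = 17_858_121, 132776.36  # figures reported in the post above
    print(f"{bits_per_byte(size):.4f} bits per input byte")
    print(f"{secs / 3600:.1f} hours, {throughput(ENWIK8_SIZE, secs):.0f} bytes/s")
```

    Under those assumptions the result works out to roughly 1.43 bits per input byte, at about 37 hours for the whole file.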

  22. #17
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    bigm_suryak v3 was run at level 6. I did not try level 9 or 10 because my laptop has only 16 GB.
    bigm_suryak v4 is in progress.

  23. #18
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    bigm_suryak v3
    enwik8 17,858,121 bytes
    compression time 132776.36 s
    It only takes ~1.3 GB of memory.

  24. #19
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    It only takes ~1.3 GB of memory.
    XML.tar from the Silesia benchmark, without Precomp and with only ~1.3 GB of memory: the result is 265136 bytes.

  25. #20
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    XML.tar from the Silesia benchmark, without Precomp and with only ~1.3 GB of memory: the result is 265136 bytes.

    bigm_suryak v8
    - improved word model
    - uses only 1.3 GB of memory
    xml file from the Silesia benchmark,
    without Precomp: 264725 bytes
    Attached Files

  26. #21
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    249
    Thanks
    114
    Thanked 123 Times in 72 Posts
    @suryakandau, please distribute the source code of bigm along with the releases. Since cmix code is GPL, and bigm contains cmix code, bigm also needs to be open source.

  27. Thanks (4):

    hexagone (10th December 2019),Mike (10th December 2019),moisesmcardona (12th December 2019),Shelwien (10th December 2019)

  28. #22
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    bigm_suryak v8
    - improved word model
    - uses only 1.3 GB of memory
    xml file from the Silesia benchmark,
    without Precomp: 264725 bytes
    enwik8: 17809514 bytes using ~1.3 GB of memory

  29. #23
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    bigm_suryak v9
    improved word model
    xml file from the Silesia benchmark: 264409 bytes using ~1.3 GB of memory
    The attached archive contains the source code and binary.
    Attached Files

  30. #24
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    bigm_suryak v9
    improved word model
    xml file from the Silesia benchmark: 264409 bytes using ~1.3 GB of memory
    The attached archive contains the source code and binary.
    Enwik8 17801098 bytes

  31. #25
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by byronknoll View Post
    @suryakandau, please distribute the source code of bigm along with the releases. Since cmix code is GPL, and bigm contains cmix code, bigm also needs to be open source.


    @byron, could you run bigm using paq8hp(11) paq8h2(11) paq8l(10) on enwik8? I just want to know how far enwik8 can be compressed. Thank you.

  32. #26
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    bigm_suryak v9.1
    xml 264149 bytes
    Attached Files

  33. #27
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    bigm_suryak v9.2
    - improved word model
    - improved nest model

    xml (Silesia benchmark): 263672 bytes
    astramina.fna: 361269 bytes
    enwik8 in progress...

    Attached Files

  34. #28
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    bigm_suryak v9.2
    - improved word model
    - improved nest model

    xml (Silesia benchmark): 263672 bytes
    astramina.fna: 361269 bytes
    enwik8 in progress...

    enwik8: 17661844 bytes using only ~1.3 GB of memory
    enwik9 in progress... the result may be ~13x.xxx.xxx bytes

    This is the source code of bigm_suryak v9.2.
    Attached Files
    Last edited by suryakandau@yahoo.co.id; 19th December 2019 at 17:16.

  35. #29
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    enwik8: 17661844 bytes using only ~1.3 GB of memory
    enwik9 in progress... the result may be ~13x.xxx.xxx bytes

    This is the source code of bigm_suryak v9.2.

    Bigm_suryak v9.4

    XML from the Silesia benchmark: 263571 bytes, without using Precomp and with only ~1.3 GB of memory

  36. #30
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    494
    Thanks
    62
    Thanked 90 Times in 70 Posts
    Quote Originally Posted by suryakandau@yahoo.co.id View Post
    Bigm_suryak v9.4

    XML from the Silesia benchmark: 263571 bytes, without using Precomp and with only ~1.3 GB of memory

    Bigm_suryak v9.4.1
    xml from the Silesia benchmark: 263522 bytes, without using Precomp and with only 1.3 GB of memory
    Attached Files
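
    The xml results reported so far can be compared with a short sketch in Python. It assumes that the 265136 figure belongs to v3 (the version posted just before it) and that "XML.tar" and "xml file" refer to the same input; neither is stated explicitly in the thread. The function names are made up for illustration.

```python
# xml (Silesia) sizes in bytes, as reported in this thread.
# Assumption: 265136 is the v3 result and all figures use the same input file.
XML_RESULTS = [
    ("v3", 265136), ("v8", 264725), ("v9", 264409), ("v9.1", 264149),
    ("v9.2", 263672), ("v9.4", 263571), ("v9.4.1", 263522),
]


def step_gains(results):
    """Return (version, bytes_saved, percent_saved) versus the previous entry."""
    gains = []
    for (_, prev), (name, size) in zip(results, results[1:]):
        saved = prev - size
        gains.append((name, saved, 100.0 * saved / prev))
    return gains


def total_gain(results):
    """Return (bytes_saved, percent_saved) from the first to the last entry."""
    first, last = results[0][1], results[-1][1]
    return first - last, 100.0 * (first - last) / first


if __name__ == "__main__":
    for name, saved, pct in step_gains(XML_RESULTS):
        print(f"{name}: -{saved} bytes ({pct:.3f}%)")
    print("total: -%d bytes (%.3f%%)" % total_gain(XML_RESULTS))
```

    Under those assumptions the total gain from the first to the last listed result is 1614 bytes, or about 0.6%.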
