Thread: BCM v0.08 - The ultimate BWT-based file compressor!

  1. #31
    michael maniscalco (Programmer)
    Originally posted by encode:
    > Invented new conception in BWT-output encoding. Still no cheating or tricks like M03
    I just noticed this line. How is it a "trick" or "cheating" to get full order context modeling of the BWT for encoding?

    M03 doesn't involve any preprocessing, alphabet reordering, RLE or secondary transforms or any filtering of any kind. It's basically optimal parsing of the BWT string along all context boundaries for each context order and nothing more. Hardly "cheating" ...

  2. #32
    encode (The Founder)
    At first sight, M03 looks like CM that uses BWT structures to access the statistics...

  3. #33
    encode (The Founder)
    The compressor is featured in the BWT comparison.

  4. #34
    encode (The Founder)
    Currently I'm working on BCM v0.09. And again I have plenty of evidence that proper CM is superior to anything else. For example, order-0 CM works better than MTF + order-0 CM. Yep, MTF may help with dumb coders, but with even a simple CM it only hurts. The MTF stage makes the data more stationary, but it removes some correlations and adds some noise, so compression with CM degrades very seriously. So, I think the future of BWT-output coding is more advanced CM coders. Fortunately, we have unlimited ideas and ways here. For example, I invented a very special model that is derived from the ICM idea. And this standalone model gives 213,886 bytes on book1!!! That's close to the BBB result while being super-fast, because the complexity of this model is equal to two[!] counters! I'm still doing heavy experiments with very creative ideas about BWT-output encoding... Additionally, I'm playing with different contexts for CM and various ideas such as QLFC...
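
For anyone not familiar with the MTF stage mentioned above, here is a minimal sketch of a plain byte-wise move-to-front transform. This is only an illustration of the general technique, not BCM's code; the names mtf_encode and mtf_decode are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Move-to-front: every byte is replaced by its current rank in a
// recency list, and the byte is then moved to the front of that list.
std::vector<uint8_t> mtf_encode(const std::vector<uint8_t>& in) {
    std::vector<uint8_t> order(256);
    std::iota(order.begin(), order.end(), 0);      // order = 0, 1, ..., 255
    std::vector<uint8_t> out;
    out.reserve(in.size());
    for (uint8_t c : in) {
        auto it = std::find(order.begin(), order.end(), c);
        assert(it != order.end());                 // all 256 values are always present
        out.push_back(static_cast<uint8_t>(it - order.begin()));
        std::rotate(order.begin(), it, it + 1);    // move c to the front
    }
    return out;
}

// Inverse transform: look up the byte at the given rank, then apply
// the same move-to-front step so both sides keep identical lists.
std::vector<uint8_t> mtf_decode(const std::vector<uint8_t>& in) {
    std::vector<uint8_t> order(256);
    std::iota(order.begin(), order.end(), 0);
    std::vector<uint8_t> out;
    out.reserve(in.size());
    for (uint8_t rank : in) {
        auto it = order.begin() + rank;
        out.push_back(*it);
        std::rotate(order.begin(), it, it + 1);
    }
    return out;
}
```

Runs of equal bytes in the BWT output turn into runs of zeros, which is why MTF helps simple coders. The ranks no longer say which symbol was seen, and that is the lost correlation the post refers to.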

  5. #35
    Yuri Grille. (Member)

    Hi encode.su, you can download a very small BCM 0.08

    This is an optimized BCM in only 48.7 KB

    [removed]

    Enjoy this good compressor.

  6. #36
    encode (The Founder)
    I think a difference of a few KB is not worth such a slow launch... I'd rather release a small size-optimized build of BCM myself... Anyway, I always prefer to keep just ONE speed-optimized build of the compressor!

  7. #37
    Yuri Grille. (Member)

    Yes encode, you are right!!

    Originally posted by encode:
    > I think that a difference in a few KB not worth such slow launch... Better I will release a small size-optimized compile of the BCM... Anyway, I always prefer to keep the ONE speed-optimized compile of the compressor!

    Yes, this is true. I made this to get a very small size so that you could send it to the benchmarks; when you open this .exe it is a little slow. Sorry for my bad English.

  8. #38
    Skymmer (Member)
    Originally posted by Yuri Grille.:
    > This is a optimized bcm in only (48.7 KB)
    I wonder what packer and manipulations it was made with. Surely it's kkrunchy. But I was only able to shrink it down to 51,200 bytes with the latest kkrunchy v0.23a4_asm07, so I think some other things have been done as well. Yuri, can you comment on it, please?

  9. #39
    encode (The Founder)

  10. #40
    Black_Fox (Tester)
    It's the same sad story as with UPack...

  11. #41
    encode (The Founder)
    Anyway, I decided to prohibit such things. We should use stable, free and clean EXE packers, such as UPX.
    Each version of BCM must have ONE executable, without slow "re-packs" and hacks. All versions should be provided by the author.
    If someone spreads other versions of BCM or any of my work, that user will be banned...

  12. #42
    Member
    At which point did encode.su become a refuge of reliable and stable software?

    I'm not being offensive, I'm just curious

  13. #43
    encode (The Founder)
    Originally posted by nanoflooder:
    > At which point did encode.su become a refuge of reliable and stable software?
    >
    > I'm not being offensive, I'm just curious
    My programs are reliable and stable. That's the prime target: the QUALITY of my work. The quality of every detail comes first. Very careful design and testing of the code and the program... I'm extremely pedantic here. And the main thing: programs should not be detected as threats by AVs.

  14. #44
    Yuri Grille. (Member)
    [the message has accidentally been deleted, sorry]

    The reason to use UPX is that most antivirus programs can unpack it.

  15. #45
    encode (The Founder)
    Originally posted by Yuri Grille.:
    > The reason to use upx is because the mostly Antivirus can unpack this.
    Additionally, and most importantly, UPX has official unpacking...

  16. #46
    Yuri Grille. (Member)

    Why did you delete my post "accidentally"? Is this a free forum??

    I'm not stealing your software, I only made an optimization to save a few KB, please don't delete my post.

    This is something like the original post:
    -----------------------------------------------------------------------
    I am using the secret version kkrunchy v0.25, you don't have this version because I'm the creator of this software. Jajajajaj

    No, that is false. This optimization is a manual optimization of the program with a hex editor and Olly debugger. High-level compilers, like the one used for BCM, always put unnecessary data in the .exe; you can replace this data with null data to make things "easier" for the .exe packer. This can be easy or difficult work, depending on the program. If you need, I can write a manual, "How to optimize .exe", but I need somebody to help me with the English, because I speak Spanish and I don't have advanced knowledge of English.

    http://sites.google.com/site/compactamos/descargas

    etc......

  17. #47
    encode (The Founder)
    Accidentally means accidentally!

    I have a Spanish translator... If you post rude messages, you know what will happen!

  18. #48
    Yuri Grille. (Member)

    Ok man.

    Sorry, but don't delete my "not rude" post.

  19. #49
    encode (The Founder)
    Finally, the new BCM beats M03 on ENWIK5!
    Actually, using ICM with SSE is questionable. So the new BCM has no ICM models. Instead, I added a few SSE stages...

  20. #50
    Member
    Originally posted by Yuri Grille.:
    > Why did you delete my post "accidentally"? Is this a free forum??
    >
    > I'm not stealing your software, I only made an optimization to save a few KB, please don't delete my post.
    >
    > This is something like the original post:
    > -----------------------------------------------------------------------
    > I am using the secret version kkrunchy v0.25, you don't have this version because I'm the creator of this software. Jajajajaj
    If you are saying you are the author of kkrunchy, then stop talking crap; I know who the author of kkrunchy is, and you are not him. Just like back in the psd thread, where you claimed to be the creator of a method for "shrinking" psd files, although it had already been around for many, many years.

  21. #51
    Bulat Ziganshin (Programmer)
    Intrinsic, he is joking - read the entire post!

    And the psd trick is so simple that he could well have found it independently.

  22. #52
    encode (The Founder)
    With a new approach, ~209100 bytes on book1 is easily possible. Anyway, my goal is ~208xxx bytes...

  23. #53
    encode (The Founder)
    Yes!
    I've got ~208xxx bytes on book1!



  24. #54
    encode (The Founder)
    This time the compression gain will be much bigger (v0.08 -> v0.09) than previously (v0.07 -> v0.08). Unfortunately, the cost of extra compression is extra processing time. Anyway, the new BCM is simply the next level...

  25. #55
    encode (The Founder)

    BCM v0.09 has been written. And it's HARDCORE! Apart from a much improved CM, I did many very important optimizations, and some parts of the code were completely rewritten. Anyway, even with such extreme optimizations the new BCM is notably slower. Personally, I think that such extra, record-breaking BWT compression is worth it. I just need to do some additional tests and optimizations to get its full potential. Also, I think that BCM v0.09 will be a kind of final version. Or at least I will take a break in its development.

    OK, here are some results of the release candidate (plus previous results, for comparison):
    book1 -> 208,863 bytes (209,826 bytes)
    calgary.tar -> 775,924 bytes (779,619 bytes)
    world95.txt -> 463,939 bytes (466,250 bytes)
    fp.log -> 548,714 bytes (557,226 bytes)
    ENWIK8 -> 20,638,938 bytes (20,744,613 bytes)
    bible.txt -> 721,809 bytes (725,091 bytes)
    3200.txt -> 3,644,158 bytes (3,660,195 bytes)


  26. #56
    Shelwien (Administrator)
    I just checked and bwmonstr's result on book1
    with alphabet reordering is 203928.
    And 205397 by default.

    Did you finally come to believe that it's not really possible to go under 208k
    with standard BWT and postcoding, or what?

  27. #57
    encode (The Founder)
    Well, I think that it is possible to get ~207xxx bytes on book1 using a standard BWT and direct coding (with no alphabet reordering or transformations of any kind). My CM can already produce about 208400 bytes on book1. It is complex enough, but I can easily imagine something far more complex. The problem is, using an extremely complex CM will kill any BWT benefits - better to use pure CM in this case...
    For sure, with direct coding ~205xxx on book1 is not really possible.
    I don't know what the heck BWMonstr is; Sami keeps silent. Anyway, it's way too slow. M03 is OK, but again it's too slow, and furthermore, the new BCM v0.09 is both faster and compresses better than M03...

  28. #58
    encode (The Founder)

    Shelwien changed the direction of further BCM development. Of course I can always add some extra models and dynamic mixing to beat BWTmix... But it will be SLOW! So, what is the goal? An overloaded BWT with PAQ speed? Nope! Now I'm thinking of making BCM lighter and faster. Since it's not the strongest anymore, I will make it really practical. I already did some serious and fundamental speed/structure optimization with my unreleased BCM... Anyway, the idea is to keep the most important models with static mixing plus one SSE. Heavy versions may have lots of models with dynamic mixing, many SSE stages... It's cool, but again it's better to just use a CM - we lose all of the advantages of BWT. At the same time, experience with such over-weighed CM back-ends as the unreleased BCM009 and BWTmix really helps to find out what kind of models/techniques are the best and how close the lighter CM is...
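
To make "a few models with static mixing plus one SSE" concrete, here is a rough sketch of that kind of bit predictor. It is only a guess at the general shape, not BCM's actual model. The 12-bit scale, the two adaptation rates, the fixed 1:3 weights, the 33-bucket SSE table and all names (Counter, Sse, Predictor) are assumptions for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int kScaleLog = 12;            // assumed probability precision
constexpr int kScale = 1 << kScaleLog;   // P(bit==1) lives in (0, 4096)

// Simple shift-based adaptive counter.
struct Counter {
    int p = kScale / 2;
    int rate;
    explicit Counter(int r) : rate(r) {}
    void update(int bit) {
        if (bit) p += (kScale - p) >> rate;
        else     p -= p >> rate;
    }
};

// One SSE stage: a small table indexed by the quantized input
// probability, read with linear interpolation between two buckets.
struct Sse {
    std::array<int, 33> t{};
    int idx = 0;                          // bucket used by the last refine()
    Sse() { for (int i = 0; i <= 32; ++i) t[i] = i * 128; }  // identity map
    int refine(int p) {
        assert(p >= 0 && p < kScale);
        idx = p >> 7;
        int w = p & 127;
        return (t[idx] * (128 - w) + t[idx + 1] * w) >> 7;
    }
    void update(int bit) {                // adapt both buckets that were used
        for (int i = idx; i <= idx + 1; ++i) {
            if (bit) t[i] += (kScale - t[i]) >> 5;
            else     t[i] -= t[i] >> 5;
        }
    }
};

struct Predictor {
    Counter fast{4};                      // adapts quickly
    Counter slow{6};                      // adapts slowly, less noisy
    Sse sse;
    int predict() {
        int mixed = (fast.p + 3 * slow.p) >> 2;   // static mixing, fixed weights
        int p = sse.refine(mixed);
        return std::min(std::max(p, 1), kScale - 1);
    }
    void update(int bit) {
        fast.update(bit);
        slow.update(bit);
        sse.update(bit);
    }
};
```

Call predict() before coding each bit and update(bit) after it. A real BWT postcoder would select the counters and the SSE table by context, which is left out here.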

  29. #59
    Shelwien (Administrator)
    > Of course I always may add some extra models and dynamic mixing to beat BWTmix...
    > But it will be SLOW!

    Not necessarily. Well, it might be, if you'd really just _add_ something.
    But the current BWTmix model is really simple, and doesn't use SSE or fsm
    counters, so there are certainly ways to beat it.

    > keep the most important models with a static mixing plus one SSE

    Don't forget about fsm. Basically, it should be possible to approximate
    a few counters (maybe even with some small contexts) + dynamic mixing + SSE
    by a pair of lookup tables (p=P[state] and state=update[state][bit]).
    Well, with increasing original state size (all counters, mixer and SSE states)
    it becomes harder to enumerate all submodel states and determine which are
    relevant, but that's what makes it interesting.
    And a pair of counters with static mixing is an obvious target for that.
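
As a toy illustration of the two-table idea (p=P[state], state=update[state][bit]), the sketch below turns one shift counter into a 64-state machine. The state count, the 12-bit scale, the rate of 3 and the "always move at least one state" rule are assumptions chosen to keep the example short. A table that approximates several counters plus a mixer and SSE would have to be built by enumerating the reachable submodel states, as described above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kStates = 64;     // assumed number of states
constexpr int kScale  = 4096;   // assumed 12-bit probability scale

struct FsmCounter {
    std::array<uint16_t, kStates> P{};                   // p = P[state], here P(bit==1)
    std::array<std::array<uint8_t, 2>, kStates> next{};  // state = next[state][bit]
    uint8_t state = kStates / 2;

    FsmCounter() {
        // State s stands for the middle of the s-th probability bucket.
        for (int s = 0; s < kStates; ++s)
            P[s] = static_cast<uint16_t>((2 * s + 1) * kScale / (2 * kStates));
        // Precompute transitions: apply a shift-counter update to P[s]
        // and quantize the result back to a state number.
        for (int s = 0; s < kStates; ++s) {
            for (int bit = 0; bit < 2; ++bit) {
                int p = P[s];
                int q = bit ? p + ((kScale - p) >> 3) : p - (p >> 3);
                int t = q * kStates / kScale;
                // Make sure the state moves even when the step is smaller
                // than one bucket (except at the two ends).
                if (t == s) t = bit ? std::min(s + 1, kStates - 1) : std::max(s - 1, 0);
                next[s][bit] = static_cast<uint8_t>(t);
            }
        }
    }

    int p() const { return P[state]; }
    void update(int bit) {
        assert(bit == 0 || bit == 1);
        state = next[state][bit];       // one table lookup, no arithmetic
    }
};
```

At coding time the model costs one byte of state per context and two table lookups per bit, which is the speed argument for replacing explicit counters.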

    > At the same time, experience with such over-weighed
    > CM-back-ends as unreleased BCM009 and BWTmix really help
    > to find out what kind of models/techniques are the best
    > and how close the lighter CM is...

    Well, as I was always saying... it's just rational to get as
    much compression as possible - basically to estimate
    the data redundancy, and only after that start to optimize
    for speed, knowing how much compression you're losing.

    And BWTmix is not really "overweighted" - it's just a very
    straightforward implementation, which probably would be
    able to reach the speed of bcm008 if optimized.

    But as a result of such statements, I'm currently working
    on speed optimization, instead of SSE2 and stuff, contrary
    to my theory.
    Well, it's just a rangecoder optimization for now, so I hope
    that it won't mess up the prediction stage. Btw, I managed
    to reduce the number of rc calls by more than half:
    http://shelwien.googlepages.com/p2.txt
    Well, somehow it didn't improve the speed right away, but
    that's probably because of some compiler weirdness.

    It was like this:
    Code:
    if( DECODE ) {
      dbit = rc.BProcess<1,1>( p0 );
      if( dbit ) { UPDATE(1) } else { UPDATE(0) }
    }
    and now it's like this:
    Code:
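    if( DECODE ) {
      // dbitx = 1 when p0 lies in the upper half of the range
      dbitx = (p0>=hSCALE); //p0>>(SCALElog-1);
      // branchless fold: pr = p0 if dbitx==0, else SCALE-p0
      int pr = p0 + ((SCALE-p0-p0)&(-dbitx));
      if( pr<M_p0lim+M_p1lim ) {
        // skewed case: the bit comes from a run counter, and the
        // rangecoder is only touched (in dec_dist) when the counter runs out
        if( pr<M_p0lim ) {
          if( b0count<0 ) dec_dist( rc, b0count, 0, M_ex0wr, M_ex0mw );
          dbit = (b0count>0); b0count--;
        } else {
          if( b1count<0 ) dec_dist( rc, b1count, 1, M_ex1wr, M_ex1mw );
          dbit = (b1count>0); b1count--;
        }
      } else {
        dbit = rc.BProcess<1,1>( pr );
      }
      // undo the fold
      dbit ^= dbitx;

      if( dbit ) { UPDATE(1) } else { UPDATE(0) }
    }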
    And these counts are up to millions on enwik8, so
    I'm not sure what exactly takes away the time saved
    on rangecoder calls.
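
The first two lines of the new version can be tried out in isolation. The sketch below reproduces just the branchless fold and its inverse. SCALElog = 12 is an assumption (the post doesn't give the value), and fold/unfold/Folded are made-up names. The run-counter part (dec_dist and the M_* limits) is not modelled.

```cpp
#include <cassert>

constexpr int SCALElog = 12;              // assumed, not stated in the post
constexpr int SCALE    = 1 << SCALElog;
constexpr int hSCALE   = SCALE / 2;

struct Folded {
    int pr;      // folded probability, always <= hSCALE
    int dbitx;   // 1 if p0 was in the upper half, i.e. the bit must be flipped back
};

// Same expressions as in the decoder snippet: -dbitx is either 0 or
// an all-ones mask, so the AND either drops or keeps (SCALE - 2*p0).
inline Folded fold(int p0) {
    assert(p0 > 0 && p0 < SCALE);
    int dbitx = (p0 >= hSCALE);
    int pr = p0 + ((SCALE - p0 - p0) & (-dbitx));
    return {pr, dbitx};
}

// A bit decoded against the folded probability is mapped back with one XOR.
inline int unfold(int dbit, int dbitx) { return dbit ^ dbitx; }
```

After the fold only one kind of skew is left (pr close to 0), so the threshold test on pr covers both a very likely 0 and a very likely 1.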

    Anyway, seems like I'm going to keep experimenting with
    BWT postcoders too, for a while, as there are many more ideas
    which I'd like to try out.

  30. #60
    encode (The Founder)
    Inspired by Shelwien, I started experimenting with dynamic mixing. I've written my own high-precision dynamic mixer... The optimizer has just started, we'll see what happens...
