View Poll Results: BCM v0.09 must be like:

Voters: 26
  • Simplified v0.08. Notably faster, less compression.

    4 15.38%
  • Optimized v0.08. About the same speed, a small compression gain.

    3 11.54%
  • Enhanced v0.08 + an extra SSE. Moderate speed penalty, nice compression gain.

    4 15.38%
  • Enhanced v0.08 + two extra SSEs. Some speed penalty, really nice compression gain.

    13 50.00%
  • Something else. Post your variant.

    2 7.69%

Thread: BCM's future

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,010
    Thanks
    399
    Thanked 398 Times in 152 Posts

    BCM's future

    A small vote about the future of the BCM.

  2. #2
    Member
    Join Date
    May 2009
    Location
    China
    Posts
    36
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I choose the first one.

  3. #3
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 173 Times in 88 Posts
    OK. Since I voted for "Something else", I'm posting my variant. The suggestion is to make the compression level selectable, so that the new BCM covers all of the above speed/ratio tradeoffs and both speed fans and compression maniacs are happy.
    Anyway, if that's not an option, then I choose the last variant: Enhanced v0.08 + two extra SSEs. Some speed penalty, really nice compression gain.

  4. #4
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well, we already have the fastest ever (LZOP) and the slowest ever (PAQ), so anything in between is fine, honestly.

  5. #5
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,010
    Thanks
    399
    Thanked 398 Times in 152 Posts

    Today I tested all of the modifications. I was curious how many times faster the fastest version is compared to the slowest one. Well, it's less than two times. In terms of complexity, however, the slowest version is far more complex: I'd say it is 10 times more complex than the fastest. The fastest one could be made even faster and simpler, but I see no reason to use a dummy coder inside BCM. Still, the slowest version is a few times faster than BWTmix or BBB... I'm not talking about the latest BWMonstr...
    Yep, I'm thinking about a selectable CM coder; say, an '-x' option will select the strongest one. Anyway, I did a straight comparison of the fastest and the slowest one to see how it feels. From a regular user's point of view, the relatively small compression gain is not worth the extra processing time: 210xxx/209xxx bytes vs. 208xxx bytes on book1, and a ~200 KB difference on ENWIK8 does not play a serious role from that point of view. But since the compression/decompression speed of the slowest version is still quite acceptable, I think it's OK indeed!
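    For illustration, a minimal sketch of how such a selectable coder could be wired up behind an '-x' switch; the CmCoder interface and the coder names here are hypothetical, not BCM's actual code:
    Code:
    #include <cstddef>
    #include <cstdio>
    #include <memory>

    // Hypothetical interface: one entry point per BWT output block.
    struct CmCoder {
        virtual ~CmCoder() = default;
        virtual void encodeBlock(const unsigned char* in, std::size_t n,
                                 std::FILE* out) = 0;
    };

    struct FastCoder : CmCoder {    // small model, no extra SSE stages
        void encodeBlock(const unsigned char*, std::size_t, std::FILE*) override {}
    };
    struct StrongCoder : CmCoder {  // bigger model with the extra SSE stages
        void encodeBlock(const unsigned char*, std::size_t, std::FILE*) override {}
    };

    // '-x' on the command line picks the strongest (and slowest) model.
    std::unique_ptr<CmCoder> makeCoder(bool xFlag) {
        if (xFlag) return std::make_unique<StrongCoder>();
        return std::make_unique<FastCoder>();
    }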

  6. #6
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 173 Times in 88 Posts
    Quote Originally Posted by encode View Post
    I'm not talking about the latest BWMonstr...
    Aha, Sami silently released BWMonstr v0.02. Thanks for the news! Another thing to test.

  7. #7
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    409
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Why don't you give the information everyone wants to read?

    Code:
    
    0.02, July 7, 2009
    ------------------
    This version implements a "compressed model" in which the data is kept
    compressed in memory all the time.
    
    Compressed model program flow:
    
      Compression: compression -> bwt -> compression
      Decompression: decompression -> compression -> unbwt -> decompression
    
    BWMonstr is able to perform BWT compression and decompression using about
    0.5n space for English text. This is 10% of the amount that typical BWT
    implementations use and around 3% - 5% that of PPM or CM implementations.
    
    The program supports multi-threading. In practice the speedup for
    compression translates in the following way:
    
      2 processors: 1.57x
      3 processors: 1.83x
      4 processors: 2.25x
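    The "compressed model" flow above can be sketched as nested calls; pack/unpack and bwt/unbwt below are hypothetical stand-ins (identity stubs) for the real in-memory entropy coder and block-sorting transform:
    Code:
    #include <vector>
    using Bytes = std::vector<unsigned char>;

    static Bytes pack(Bytes b)   { return b; }  // stub: in-memory entropy coder
    static Bytes unpack(Bytes b) { return b; }  // stub: its inverse
    static Bytes bwt(Bytes b)    { return b; }  // stub: block-sorting transform
    static Bytes unbwt(Bytes b)  { return b; }  // stub: inverse transform

    // Compression: compression -> bwt -> compression
    Bytes compress(const Bytes& input) {
        return pack(bwt(pack(input)));
    }

    // Decompression: decompression -> compression -> unbwt -> decompression
    Bytes decompress(const Bytes& archive) {
        return unpack(unbwt(pack(unpack(archive))));
    }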

  8. #8
    Expert Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    Actually, BWMonstr 0.02 has been released for a while.
    http://mattmahoney.net/dc/text.html#1605

    Quite impressive. Uses less memory than the block size. Uses all cores in parallel on a single block with no loss of compression ratio. Makes the Pareto frontier on size/memory. Unfortunately, it is slower than paq8px on a single core.

    paq8k2 is still the slowest, however; enwik9 would take months.

  9. #9
    Expert Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    Anyway, about the vote: what I'd like to see is good speed and compatibility between versions, kind of like zip.

  10. #10
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,909
    Thanks
    291
    Thanked 1,271 Times in 718 Posts
    Why don't you merge bbb and zpaq then?
    Last edited by Shelwien; 31st July 2009 at 02:01.

  11. #11
    Expert Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    Yeah, I've been meaning to write a BWT- or LZP+BWT-based compressor in ZPAQ. The inverse transform should not be too hard to write in ZPAQL (I already did LZP). But for now, I think I will work on a .bmp compressor first.
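    For reference, the transform side really is the easy half; a textbook inverse BWT in C++ (not the ZPAQL version), given the last column L and the row index of the original string among the sorted rotations, looks roughly like this:
    Code:
    #include <string>
    #include <vector>

    std::string inverseBWT(const std::string& L, int primary) {
        const int n = (int)L.size();
        std::vector<int> count(256, 0), start(256, 0), next(n);
        // Stable-sort positions of L by symbol, i.e. recover the first column.
        for (unsigned char c : L) ++count[c];
        for (int c = 1; c < 256; ++c) start[c] = start[c - 1] + count[c - 1];
        for (int i = 0; i < n; ++i) next[start[(unsigned char)L[i]]++] = i;
        // Walk the cycle starting from the row that holds the original string.
        std::string out(n, '\0');
        int p = next[primary];
        for (int i = 0; i < n; ++i) { out[i] = L[p]; p = next[p]; }
        return out;
    }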

  12. #12
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,564
    Thanks
    773
    Thanked 687 Times in 372 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Uses less memory than the block size
    it's probably a dictionary transformation?

  13. #13
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    it's probably a dictionary transformation?
    Not exactly; it uses the same approach as Shelwien does: it first compresses the input and then sorts the contexts based on this compressed input.
    BIT Archiver homepage: www.osmanturan.com

  14. #14
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yeah - exactly: http://encode.su/forum/showthread.php?t=379

    I'd vote for max compression. Maybe you can use a 2D SSE instead of multiple chained SSE stages?
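    Roughly, a 2D SSE stage would index one table by both contexts at once instead of chaining two 1D stages. A sketch on a 12-bit probability scale; the table sizes and update rate are illustrative, and interpolation between bins is omitted for brevity:
    Code:
    #include <cstdint>

    struct SSE2D {
        static const int kBins = 33;          // quantized probability bins
        uint16_t t[256][256][kBins];          // [ctx1][ctx2][bin] -> refined p
        int c1 = 0, c2 = 0, bin = 0;          // saved for update()

        SSE2D() {                             // start each row as the identity map
            for (int i = 0; i < 256; ++i)
                for (int j = 0; j < 256; ++j)
                    for (int k = 0; k < kBins; ++k)
                        t[i][j][k] = uint16_t(k * 4095 / (kBins - 1));
        }
        // Refine probability p0 (0..4095) under the joint context (ctx1, ctx2).
        int refine(int p0, int ctx1, int ctx2) {
            c1 = ctx1; c2 = ctx2;
            bin = (p0 * (kBins - 1) + 2048) / 4096;
            return t[c1][c2][bin];
        }
        // After coding the bit, drag the selected cell toward the outcome.
        void update(int bit) {
            int target = bit ? 4095 : 0;
            uint16_t& cell = t[c1][c2][bin];
            cell = uint16_t(cell + (target - cell) / 32);
        }
    };
    // Note: the table is ~4 MB, so allocate SSE2D on the heap, not the stack.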

  15. #15
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    919
    Thanks
    57
    Thanked 113 Times in 90 Posts
    Is it possible to have better compression without a decompression speed penalty, but only at the cost of compression time?

    I really don't care about compression speed. I can compress while I do other stuff.
    But when I decompress, I'm waiting for the data, and then time becomes important.

    Just my thought.

  16. #16
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,909
    Thanks
    291
    Thanked 1,271 Times in 718 Posts
    Yeah, it's possible to perform a limited optimization for blocks of data
    during compression, and just store the parameters for decompression.
    Alphabet reordering and a dynamic dictionary are examples of that too,
    but actually I meant something like trying multiple models and selecting
    the best one.
    However, I'd be really surprised if encode ever implemented something like that.
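    A sketch of that idea: try several models per block at compression time, keep the smallest output, and store only the winning model's id, so decompression pays no search cost. The Model interface and the helpers are hypothetical:
    Code:
    #include <cstddef>
    #include <cstdint>
    #include <vector>
    using Bytes = std::vector<unsigned char>;

    struct Model {
        virtual ~Model() = default;
        virtual Bytes encode(const Bytes& block) const = 0;
        virtual Bytes decode(const Bytes& data) const = 0;
    };

    Bytes compressBlock(const Bytes& block, const std::vector<Model*>& models) {
        std::size_t best = 0;
        Bytes bestOut = models[0]->encode(block);
        for (std::size_t i = 1; i < models.size(); ++i) {
            Bytes out = models[i]->encode(block);        // the slow search happens here
            if (out.size() < bestOut.size()) { best = i; bestOut = out; }
        }
        bestOut.insert(bestOut.begin(), uint8_t(best));  // 1-byte model id header
        return bestOut;
    }

    Bytes decompressBlock(const Bytes& data, const std::vector<Model*>& models) {
        // Decompression reads the stored id and runs exactly one model.
        return models[data[0]]->decode(Bytes(data.begin() + 1, data.end()));
    }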

  17. #17
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 173 Times in 88 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Actually, BWMonstr 0.02 has been released for a while.
    http://mattmahoney.net/dc/text.html#1605

    Quite impressive. Uses less memory than the block size. Uses all cores in parallel on a single block with no loss of compression ratio. Makes the Pareto frontier on size/memory. Unfortunately, it is slower than paq8px on a single core.

    paq8k2 is still the slowest, however; enwik9 would take months.
    Well, it seems that BWMonstr v0.02 has already been tested, but since I've done some tests myself, I think there is nothing wrong with publishing them.

    version = size in bytes = comp. time (s) = dec. time (s)
    Code:
    BOOK1
    0.01 = 205 397 =  10.575 = 10.025
    0.02 = 204 844 = 108.833 = 50.646
    
    ENWIK6
    0.01 = 245 958 =  17.140 =  17.032
    0.02 = 244 590 = 181.607 = 105.320
    
    ENWIK8
    0.01 = 20 379 365 =  1726.326 =  1675.510
    0.02 = 20 307 295 = 17908.286 = 10406.475
    The compression ratio improvement is very modest, while v0.02 is more than 10 times slower! Well, the good thing here is that v0.02 shows more asymmetry. Also, something strange happens during compression: the output file grows slowly, then at some point its size resets and starts growing again, but faster.
    Maybe from some technical point of view BWMonstr v0.02 is unique, but to me it looks a little bit strange. For example:
    Code:
    ENWIK6
            0.01 = 245 958 =  17.140 =  17.032
            0.02 = 244 590 = 181.607 = 105.320
    paq8px_61 -1 = 230 097 =  14.250

  18. #18
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,909
    Thanks
    291
    Thanked 1,271 Times in 718 Posts
    You can use text preprocessing with BWMonstr too... like WRT or DRT.
    I mean, paq8px isn't a plain universal context model.
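    The idea behind WRT/DRT-style preprocessing, as a toy sketch: frequent words are swapped for short codes before compression and the mapping is reversed afterwards. The dictionary and the whitespace-only tokenizer here are deliberately simplistic:
    Code:
    #include <map>
    #include <sstream>
    #include <string>

    std::string wrtEncode(const std::string& text,
                          const std::map<std::string, std::string>& dict) {
        std::istringstream in(text);
        std::ostringstream out;
        std::string word;
        bool first = true;
        while (in >> word) {                  // toy tokenizer: spaces only
            auto it = dict.find(word);
            out << (first ? "" : " ") << (it != dict.end() ? it->second : word);
            first = false;
        }
        return out.str();
    }
    // E.g. wrtEncode("the cat and the dog", {{"the", "\x01"}, {"and", "\x02"}})
    // yields "\x01 cat \x02 \x01 dog", which the BWT/CM coder then sees as
    // shorter, more regular input.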

