View Poll Results: Should I release BCM v2.00 (not compatible with v1.xx)?

Voters
29
  • Yes

    28 96.55%
  • No

    1 3.45%
Results 121 to 127 of 127

Thread: BCM - The ultimate BWT-based file compressor

  1. #121
    Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,897
    Thanks
    291
    Thanked 1,267 Times in 715 Posts
    For BWT there's an option to apply bitwise arithmetic coding to an already compressed bitcode.
    As an example, bsc/qlfc already achieves a 4x bit count reduction just by converting symbols to ranks/RLE.
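
    To illustrate what "converting symbols to ranks/RLE" can look like, here is a minimal sketch using plain move-to-front plus run-length pairs. This is an assumption made for illustration only: bsc's qlfc is a different and more refined transform, and the names `move_to_front` and `run_length` are hypothetical.

```python
def move_to_front(data):
    """Replace each byte by its position in a recency-ordered table."""
    table = list(range(256))
    ranks = []
    for b in data:
        r = table.index(b)      # rank of the symbol in the current order
        ranks.append(r)
        table.pop(r)            # move the symbol to the front
        table.insert(0, b)
    return ranks


def run_length(ranks):
    """Collapse repeated ranks into (rank, count) pairs."""
    runs = []
    for r in ranks:
        if runs and runs[-1][0] == r:
            runs[-1] = (r, runs[-1][1] + 1)
        else:
            runs.append((r, 1))
    return runs


# BWT output tends to contain long runs of identical symbols.
bwt_like = b"aaaabbbbaaaacccc"
ranks = move_to_front(bwt_like)
runs = run_length(ranks)
print(ranks)
print(runs)
```

    On run-heavy input most ranks become 0, so the rank stream collapses into a handful of pairs before any entropy coding is applied.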

  2. #122
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    502
    Thanks
    180
    Thanked 177 Times in 120 Posts
    The nature of BWT data having lots of similar symbols adjacent to each other lends itself well to a bytewise arith coder that shuffles the order of symbols based on what's recently been seen. That ought to make the search faster. Either that or SIMD CDF update functions.
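
    A rough sketch of why recency ordering ought to shorten the symbol search, assuming the coder finds a symbol by a linear scan over its table. The function `scan_steps` and the sample string are hypothetical; a real coder would scan cumulative frequencies rather than call `list.index`.

```python
def scan_steps(data, reorder):
    """Count table entries visited by a linear search for each symbol.

    With reorder=True the found symbol is moved to the front, so
    recently seen symbols are found after very few steps.
    """
    order = list(range(256))
    steps = 0
    for b in data:
        pos = order.index(b)
        steps += pos + 1                      # entries visited, including the hit
        if reorder and pos:
            order.insert(0, order.pop(pos))   # shuffle by recency
    return steps


sample = b"eeeeeeeettttttttaaaaaaaaeeeeeeee"    # clustered, BWT-like data
fixed = scan_steps(sample, reorder=False)
recent = scan_steps(sample, reorder=True)
print(fixed, recent)
```

    On clustered data like this the recency-ordered scan visits far fewer entries than the fixed-order one.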

    With a pure o1 range coder (not ANS) on a 1GB pre-BWT-ed copy of enwik9, I get 180125672 bytes at 57MB/s enc and 52MB/s dec (no threads).
    With a built-in RLE+o1 joint coder it's 174962203 bytes at 62MB/s encode and 57MB/s decode.

    Add in LZP and it's probably a little smaller / faster.

    This isn't using any clever CDF stuff. What sort of speed are you wanting for the coder side of things? Static 128Kb blocks with RLE + rANS order-1 gets me 190634138 bytes, which is poor, but at ~150MB/s encode speed and 350MB/s decode speed. The ratio isn't great, and static frequencies aren't good for BWT. Maybe some delayed update, making it quasi-adaptive, would be a good middle ground.

  3. #123
    Shelwien (Administrator)
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,897
    Thanks
    291
    Thanked 1,267 Times in 715 Posts
    I guess it could be interesting to experiment with nibble alphabets if it's non-binary.
    For example, encode low-nibble+escape, then the high-nibble-rank only after an escape.
    A nibble CDF update can simulate a bitwise model, and a nibble rank update can be done with a single vector instruction.
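
    As a sketch of what an adaptive 16-symbol CDF update might look like, here is one common shift-based formulation. It is an assumption for illustration, not necessarily the scheme meant above, and `TOTAL`, `RATE`, `init_cdf`, `update_cdf` and `prob` are hypothetical names. Each of the 15 inner entries is updated independently, which is what makes this kind of update a natural fit for a vector instruction.

```python
TOTAL = 1 << 15   # probability scale (assumed)
RATE = 4          # adaptation speed as a shift (assumed)
NSYM = 16         # nibble alphabet


def init_cdf():
    """Uniform CDF: cdf[i] is the total frequency of symbols below i."""
    return [i * TOTAL // NSYM for i in range(NSYM + 1)]


def update_cdf(cdf, sym):
    """Move every inner CDF entry toward a distribution peaked at sym.

    Entries above sym move up, entries at or below sym move down; the
    targets keep at least one count per symbol. cdf[0] and cdf[NSYM]
    stay fixed.
    """
    for i in range(1, NSYM):
        if i > sym:
            cdf[i] += (TOTAL - (NSYM - i) - cdf[i]) >> RATE
        else:
            cdf[i] -= (cdf[i] - i) >> RATE


def prob(cdf, sym):
    """Current frequency of sym out of TOTAL."""
    return cdf[sym + 1] - cdf[sym]


cdf = init_cdf()
before = prob(cdf, 5)
for _ in range(20):
    update_cdf(cdf, 5)
after = prob(cdf, 5)
print(before, after)
```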

  4. #124
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,010
    Thanks
    399
    Thanked 398 Times in 152 Posts
    Okay, the FINAL version of the BCMv1 is here:
    https://github.com/encode84/bcm

    It is fully compatible with previous versions.

    My next BWT-based file compressor will be either BCMv2 or a completely new one (might be SQUID - since it's originally based on a fast byte-wise arith coder)


  5. Thanks (2):

    jibz (6th February 2020),Mike (6th February 2020)

  6. #125
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,010
    Thanks
    399
    Thanked 398 Times in 152 Posts
    Originally posted by JamesB:
    What sort of speed are you wanting for the coder side of things?
    Speed = as fast as possible. And the compression MUST be notably higher than LZMA with a 1 GB window, or else this BWT will be pointless.

  7. #126
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,010
    Thanks
    399
    Thanked 398 Times in 152 Posts
    BTW, results of my byte-wise o1 on enwik9.bwt:
    Code:
    Comp. size = 180372557
    Elapsed time = 11.205s
    It's close to 100 MB/sec!
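
    For reference, the arithmetic behind that figure, assuming enwik9.bwt is about 10^9 bytes (enwik9 itself is exactly 10^9 bytes; the BWT output may carry a few extra index bytes) and that MB means 10^6 bytes:

```python
INPUT_BYTES = 10**9          # assumed size of enwik9.bwt
COMP_BYTES = 180372557       # "Comp. size" from the post
ELAPSED_S = 11.205           # "Elapsed time" from the post

mb_per_s = INPUT_BYTES / ELAPSED_S / 1e6
ratio = COMP_BYTES / INPUT_BYTES

print(f"{mb_per_s:.1f} MB/s, {ratio:.2%} of original size")
```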

  8. Thanks:

    JamesB (7th February 2020)

  9. #127
    Member
    Join Date
    May 2019
    Location
    Japan
    Posts
    26
    Thanks
    4
    Thanked 8 Times in 4 Posts
    Originally posted by encode:
    Okay, the FINAL version of the BCMv1 is here:
    https://github.com/encode84/bcm
    I'm thinking of testing the latest BCM in my benchmark [1] (where v1.30 currently participates). Therefore I have a question (and a feature request).

    Does BCM still only work with files? If so, then I'd like to request adding support for using stdin/stdout (streaming mode) for uncompressed data.

    In my benchmark I use all compressors only in streaming mode (reading from stdin during compression, writing to stdout during decompression). This streaming mode is important for many practical applications, where multiple tools pipe data to each other. I want my results to be relevant for such applications. Therefore, for compressors without such a streaming mode, I bolt it on via a wrapper script. The wrapper decompresses data into a temporary file, then streams this file to stdout, and the TOTAL time is measured and compared with other compressors. Naturally, if a compressor can output to stdout by itself, the wrapper and temporary file are not needed and the true speed can be seen.

    [1] http://kirr.dyndns.org/sequence-compression-benchmark/
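
    A minimal sketch of the kind of wrapper described above (decompress to a temporary file, stream it out, time the whole thing). This is not the benchmark's actual script; `stream_via_tempfile` is a hypothetical name, and the decompressor command is passed in as a parameter because no real command line is assumed here. The demo uses a stand-in command that just writes four bytes.

```python
import io
import os
import shutil
import subprocess
import sys
import tempfile
import time


def stream_via_tempfile(make_cmd, out):
    """Run a file-only decompressor, then stream its output file to `out`.

    make_cmd(tmp_path) must return the argv list that writes the
    decompressed data to tmp_path. Returns the TOTAL elapsed seconds,
    including the extra pass over the temporary file.
    """
    start = time.perf_counter()
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)  # the file now exists; the tool must be willing to overwrite it
    try:
        subprocess.run(make_cmd(tmp_path), check=True)
        with open(tmp_path, "rb") as f:
            shutil.copyfileobj(f, out)
    finally:
        os.remove(tmp_path)
    return time.perf_counter() - start


# Stand-in "decompressor" (an assumption for the demo): writes ACGT to the path.
def fake_decompressor(tmp_path):
    code = "import sys; open(sys.argv[1], 'wb').write(b'ACGT')"
    return [sys.executable, "-c", code, tmp_path]


sink = io.BytesIO()  # in a real wrapper this would be sys.stdout.buffer
elapsed = stream_via_tempfile(fake_decompressor, sink)
print(sink.getvalue(), f"{elapsed:.3f}s")
```

    The measured time includes writing and re-reading the temporary file, which is exactly the overhead a native stdout mode would avoid.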
