Results 1 to 1 of 1

Thread: experiments with a new idea for MTF with prediction

  1. #1
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    140
    Thanks
    26
    Thanked 94 Times in 31 Posts

    experiments with a new idea for MTF with prediction

    So I recently put together a version of M99 that was tuned specifically for binary alphabets. Dealing with only two symbols provides lots of opportunities for performance gains (in terms of speed, not compression). And this was definitely the case for binary M99 which can encode/decode at hundreds of MB/sec single threaded. This is much faster than the standard version but the compression numbers sucked horribly so I started playing around with some ideas to transform the data before encoding. Normally I wouldn't endorse the idea of using any post BWT transforms like MTF, RLE etc. They only hurt compression in general. But binary m99 certainly needs some help in this regard so I put together a bitwise MTF idea that I had been kicking around for some time and figured that I would share it.

    The idea is MTF combined with prediction and it works like this (pseduo code):

    1: value_to_encode = input_byte ^ previous_input_byte
    2: starting with LSB, pack 0 bit if value_to_encode bit is zero. Continue while all bits are zero or all bits packed.
    3a: if not all bits packed then value_to_encode = input_byte ^ previous_previous_input_byte (two bytes prior). Go back to 2 and continue encoding remaining bits.
    3b. if bits still remain to pack after switching to previous_previous_input_byte then value_to_encode = history[bits_already_packed] go back to 2 and continue encoding remaining bits
    4. update history[bits_actually_packed] = input_byte

    So the concept is to encode against the previous symbol and if not an exact match encode against the symbol prevoius to the current previous symbol.
    And if both of these previous symbols do not exactly match the current byte to encode then switch to encoding against the last symbol we encountered (a char [256]) which matched the pattern of the bits
    that we have already encoded using the two previous symbols. Then update our history array (char [256]) to include the symbol we just encoded using the bits that were already encoded at that time.

    After this phase I added a second transform which reordered the bits of the input from [01234567][01234567][01234567] to [000111222333444555666777] and encode this value with bitwise m99.
    The results aren't *great* yet but I think the basic idea of MTF with predictions is interesting enough on its own to merit mentioning.

    The M99 binary coder (super fast) and the MTF-P code is available at:

    https://github.com/michaelmaniscalco/m99_binary.git


    Some numbers on the throughput of M99_binary (the output breaks out the encoder/decoder speed explicitly as I was mostly focusing on how fast that part can be).

    Code:
    ./build/release/bin/m99_binary enwik8
    m99_binary - experimental ideas - m99 for binary alphabets.  Author: M.A. Maniscalco (1999 - 2019)
    compressed: 100000000 -> 31322961 bytes.  ratio = 31.323%
    M99 elapsed encode time: 0.483012 seconds : 207.034 MBytes/sec
    Overall elapsed time: 4.40366 seconds : 22.7084 MBytes/sec
    M99 elapsed decode time: 0.56565 seconds : 176.788 MBytes/sec
    Overall elapsed time: 3.85271 seconds : 25.9557 MBytes/sec
    Decode verified
    Test machine specs:
    Code:
    Vendor ID:           GenuineIntel
    CPU family:          6
    Model:               158
    Model name:          Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
    Stepping:            10
    CPU MHz:             855.564
    CPU max MHz:         4100.0000
    CPU min MHz:         800.0000
    Sorry if the description of MTF-P is unclear. It's pretty late here in Boston.
    - Michael

  2. Thanks (3):

    Bulat Ziganshin (25th April 2019),Mike (25th April 2019),RichSelian (26th April 2019)

Similar Threads

  1. C++ constexpr experiments
    By Shelwien in forum The Off-Topic Lounge
    Replies: 1
    Last Post: 25th February 2019, 18:04
  2. Adaptive Lossless Prediction impl in C++
    By mo_ in forum Data Compression
    Replies: 1
    Last Post: 26th January 2017, 05:49
  3. Experiments with orderN-CM
    By Sebastian in forum Data Compression
    Replies: 91
    Last Post: 11th January 2012, 00:09
  4. Experiments with small dictionary coding
    By Matt Mahoney in forum Data Compression
    Replies: 13
    Last Post: 28th June 2010, 16:34
  5. Ordered bitcodes experiments
    By Shelwien in forum Data Compression
    Replies: 19
    Last Post: 30th May 2009, 03:45

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •