Results 1 to 2 of 2

Thread: MTF and Coroutines and Codec APIs

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Kharkov, Ukraine
    Thanked 1,293 Times in 734 Posts

    MTF and Coroutines and Codec APIs


    MTF is a data transform, a simple case of symbol ranking.
    It was used in old-style BWT postcoders, including bzip2.
    Its complexity is not exactly low though, so its rarely
    used as is, if only as an example.
    But symbol ranking (including MTF) is quite relevant
    for bytewise statistical compressors.
    For example, PPMd implements an approximation of
    frequency sorting, and Ash specifically uses MTF.

    gMTF here (from "generalized MTF") is a homemade
    modification which assigns a rank to a symbol using
    a function of distance to its last occurence.
    Somehow, it provides better results after entropy
    coding than normal MTF and a couple of other tested


    There's a common problem with availability of a convenient API
    to a compression library with entropy coding.
    Its somewhat easier for decompression, because we
    can split the data into reasonable blocks and provide
    a way to acquire the buffer size for unpacked data
    before actual decoding.
    But sequential compression API commonly works with
    some kind of "streams" which is far from convenient
    (requires lots of preparations, sometimes including
    writing special callback functions etc)
    and also far from efficient too.

    So, imho, the best interface for a compression
    library would be something like
    int Compress( byte* inp,uint inplen, byte* out,uint outlen );
    with assumption that it just returns a corresponding status code
    when any buffer is finished, and can resume from any point.
    Its easily compatible with non-C languages, and ideally can
    be optimized as far as possible.
    Also, multiple components with such interface can be
    easily stacked together without sacrificing any speed.

    Well, the only problem is how to handle a "sudden"
    buffer overflow somewhere deep in the nested functions.
    And - which is even harder - how to return back and
    resume from the same point after getting another buffer.

    And coroutines is a concept that solves exactly that problem.

    3. Sources
    - Plain implementation of MTF and gMTF
    - Coroutine implementation

    mtf c input output -- forward MTF
    mtf d output input2 -- inverse MTF
    mtf C input output -- forward gMTF (generalized MTF)
    mtf D output input2 -- inverse gMTF
    Last edited by Shelwien; 23rd January 2010 at 20:58.

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Kharkov, Ukraine
    Thanked 1,293 Times in 734 Posts

Similar Threads

  1. Lossless Audio Codec
    By encode in forum Forum Archive
    Replies: 8
    Last Post: 1st August 2007, 18:36

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts