Thread: Context quantization and CM asymmetry

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,372
    Thanks
    213
    Thanked 1,020 Times in 541 Posts

    Context quantization and CM asymmetry

    http://cbloomrants.blogspot.com/2010...mpression.html

    > In these sensitive parts of the coder, you
    > obviously want to use as much context as
    > possible, but if you use too much your
    > statistics become too sparse and you start
    > making big coding mistakes.

    In other words, the context quantization function
    is a lossy compression function, but unlike
    lossy media coders, we have a well defined
    quality metric here (entropy).
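
    As a minimal sketch of that idea (my own illustration, not
    anyone's actual coder): the "quantizer" below just keeps the
    top K bits of the previous byte, and the quality metric is the
    adaptive order-1 coding cost in bits per character; "book1" is
    only a placeholder sample file.

    Code:
    // Sketch: score a lossy context quantization by its coding cost.
    // Assumption: "book1" (or argv[1]) is some sample file to measure on.
    #include <cstdio>
    #include <cstdint>
    #include <cmath>
    #include <vector>

    int main(int argc, char** argv) {
      FILE* f = fopen(argc > 1 ? argv[1] : "book1", "rb");
      if (!f) return 1;
      std::vector<uint8_t> data;
      for (int c; (c = getc(f)) != EOF;) data.push_back(uint8_t(c));
      fclose(f);
      if (data.empty()) return 1;

      for (int K = 0; K <= 8; K++) {              // number of context bits kept
        int nctx = 1 << K;
        std::vector<uint32_t> cnt(nctx * 256, 1); // symbol counts, +1 prior
        std::vector<uint32_t> tot(nctx, 256);
        double bits = 0;
        uint8_t prev = 0;
        for (uint8_t c : data) {
          int ctx = prev >> (8 - K);              // lossy quantization of the o1 context
          bits -= std::log2(double(cnt[ctx * 256 + c]) / tot[ctx]);
          cnt[ctx * 256 + c]++;
          tot[ctx]++;
          prev = c;
        }
        printf("K=%d  %.4f bpc\n", K, bits / data.size());
      }
      return 0;
    }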

    > For example something that hasn't been explored
    > very much in general in text compression is
    > severely asymmetric coders.

    I guess you just missed it completely.
    In text compression there are preprocessors, like
    http://www.ii.uni.wroc.pl/~inikep/
    and in special cases, like the Hutter Prize, people
    put a lot of work into the selection and arrangement
    of words in the dictionary.

    Also there're quite a few projects with
    parameter optimization pass.
    For example, see
    http://sites.google.com/site/toffer86/m1-project
    There's an "o" processing mode which builds a
    model profile for given data samples (which
    includes context masks and counter update
    constants and the like).

    There're also many other projects with a similar
    approach (eg. epmopt and beeopt), and most of my
    coders are made like that; it's just that this
    approach makes more sense in the development
    stage than in public utilities.
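
    As an illustration of what such an optimization pass
    does (a rough sketch, not M1's actual code): hill-climb
    over a context bit mask and a counter update shift,
    scoring each candidate by its coding cost on a sample
    file ("book1" is just a placeholder here).

    Code:
    #include <cstdio>
    #include <cstdint>
    #include <cstdlib>
    #include <cmath>
    #include <vector>

    struct Profile { uint8_t mask; int shift; };     // context mask + update constant

    // bitwise order-1 model; p(bit=1) kept as a 12-bit probability per node
    static double cost(const std::vector<uint8_t>& data, Profile pr) {
      std::vector<uint16_t> p(256 * 256, 2048);      // (masked prev byte, bit-tree node)
      double bits = 0;
      uint8_t prev = 0;
      for (uint8_t c : data) {
        int ctx = (prev & pr.mask) << 8, node = 1;
        for (int i = 7; i >= 0; i--) {
          int bit = (c >> i) & 1;
          uint16_t& pp = p[ctx + node];
          bits -= std::log2(bit ? pp / 4096.0 : 1 - pp / 4096.0);
          if (bit) pp += (4096 - pp) >> pr.shift;    // counter update constant = shift
          else     pp -= pp >> pr.shift;
          node = node * 2 + bit;
        }
        prev = c;
      }
      return bits / data.size();
    }

    int main(int argc, char** argv) {
      FILE* f = fopen(argc > 1 ? argv[1] : "book1", "rb");
      if (!f) return 1;
      std::vector<uint8_t> data;
      for (int c; (c = getc(f)) != EOF;) data.push_back(uint8_t(c));
      fclose(f);
      if (data.empty()) return 1;

      Profile best{0xFF, 5};
      double bestc = cost(data, best);
      for (int iter = 0; iter < 200; iter++) {       // random single-parameter mutations
        Profile cand = best;
        if (rand() & 1) cand.mask ^= 1 << (rand() % 8);
        else            cand.shift = 2 + rand() % 6;
        double c = cost(data, cand);
        if (c < bestc) { bestc = c; best = cand; }   // keep only improvements
      }
      printf("best profile: mask=%02X shift=%d  %.4f bpc\n",
             (unsigned)best.mask, best.shift, bestc);
      return 0;
    }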

  2. #2
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi,

    First of all, I want to welcome you too; it's nice to see some "old-school" guys appearing again.

    > For example an obvious one is: in each context we generally store only something like
    > the number of each character that has occurred. We might do something like scale the
    > counts so that more recent characters count more. eg. you effectively do {all counts} *= 0.9
    > and then add in the count of the new character as ++.
    In binary modeling this approach is already used, since it can be implemented efficiently. It can be derived via the minimization of exponentially weighted coding cost. See the attachment (it's incomplete, but explains exactly this).
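
    A small numeric sketch of that equivalence (mine, not the attachment's derivation): with exponentially decayed counts n0, n1 the estimate n1/(n0+n1) is an exponentially weighted average of past bits, and once the decayed total saturates at 1/(1-decay) it behaves exactly like the constant-rate binary update p += rate*(bit - p).

    Code:
    #include <cstdio>

    int main() {
      const double decay = 0.9;
      double n0 = 1, n1 = 1;                  // decayed counts, with a small prior
      double p  = n1 / (n0 + n1);             // probability form of the same state
      int bits[] = {1,1,0,1,1,1,0,1,1,1,1,0,1,1};
      for (int b : bits) {
        n0 *= decay; n1 *= decay;             // count form: decay all counts...
        (b ? n1 : n0) += 1;                   // ...then ++ the observed symbol
        double rate = 1.0 / (n0 + n1);        // tends to 1-decay = 0.1
        p += rate * (b - p);                  // probability form: linear update
        printf("count est %.4f   prob est %.4f   rate %.4f\n", n1 / (n0 + n1), p, rate);
      }
      return 0;
    }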

    My experience with parameter optimization of CM coders clearly confirms the observation that contexts are the most sensitive parameter of a CM algorithm.

    EDIT: Don't get it wrong - I mean the model contexts used to create the initial statistics that get mixed, not SSE or mixer contexts.
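
    To make that distinction concrete, here is a minimal logistic-mixing sketch (my illustration, not toffer's code): the model contexts (order-0 and order-1 here) select the counters that produce the input probabilities, while the mixer context (just the bit position here) only selects which weight set combines them; "book1" is a placeholder sample file.

    Code:
    #include <cstdio>
    #include <cstdint>
    #include <cmath>
    #include <vector>

    static double stretch(double p) { return std::log(p / (1 - p)); }
    static double squash(double x)  { return 1 / (1 + std::exp(-x)); }

    int main(int argc, char** argv) {
      FILE* f = fopen(argc > 1 ? argv[1] : "book1", "rb");
      if (!f) return 1;
      std::vector<uint8_t> data;
      for (int c; (c = getc(f)) != EOF;) data.push_back(uint8_t(c));
      fclose(f);
      if (data.empty()) return 1;

      std::vector<double> p0(256, 0.5), p1(256 * 256, 0.5);  // model counters
      std::vector<double> w(8 * 2, 0.3);                      // mixer weights, one pair per bit position
      double bits = 0;
      uint8_t prev = 0;
      for (uint8_t c : data) {
        int node = 1;
        for (int i = 7; i >= 0; i--) {
          int bit = (c >> i) & 1;
          double s0 = stretch(p0[node]);                // model context: order-0
          double s1 = stretch(p1[(prev << 8) + node]);  // model context: order-1
          double* wr = &w[2 * (7 - i)];                 // mixer context: bit position only
          double p = squash(wr[0] * s0 + wr[1] * s1);   // mix in the logistic domain
          bits -= std::log2(bit ? p : 1 - p);
          double err = bit - p;                         // online logistic weight update
          wr[0] += 0.002 * err * s0;
          wr[1] += 0.002 * err * s1;
          p0[node] += 0.02 * (bit - p0[node]);          // model counter updates
          p1[(prev << 8) + node] += 0.02 * (bit - p1[(prev << 8) + node]);
          node = node * 2 + bit;
        }
        prev = c;
      }
      printf("%.4f bpc\n", bits / data.size());
      return 0;
    }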

    Using plain (bit) counters under some context (instead of FSM counters) introduces great sensitivity in their weighting parameters, too. On the other hand, it's rather surprising that an optimization pass doesn't start at something like 8 bpc (with randomly assigned initial parameters), but only about 0.5 - 1 bpc worse than with optimized parameters. Of course such a statement can hardly be generalized; it depends on the given model structure.
    Attached Files
    Last edited by toffer; 15th September 2010 at 17:59.

  3. #3
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Shelwien:
    There're also many other projects with a similar
    approach (eg. epmopt and beeopt), and most of my
    coders are made like that; it's just that this
    approach makes more sense in the development
    stage than in public utilities.
    Seems you have forgotten BIT. As you remember, I've used the MATLAB Optimization Toolbox for optimizing model parameters. But I have to admit Toffer's approach (GA+RHC) is more efficient, because as a final stage I need some random bit flipping to achieve better results after the GA phase. Actually, that's a known property of GA: it converges "around" the solution space, but cannot go further on its own.
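
    As an illustration of that final stage (a hedged sketch, not BIT's or M1's actual optimizer; coding_cost below is a stand-in objective, not a real evaluator): after the GA has landed near a good region, keep flipping random bits of the encoded parameter vector and accept a flip only when the cost improves.

    Code:
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    // Stand-in objective: a real tuner would decode the genome into model
    // parameters, run the coder over a sample and return bits per character.
    static double coding_cost(const std::vector<int>& genome) {
      double c = 0;
      for (size_t i = 0; i < genome.size(); i++)
        c += genome[i] ^ ((i * 2654435761u >> 13) & 1);   // arbitrary target pattern
      return c;
    }

    int main() {
      std::vector<int> genome(64);                 // bit-encoded model parameters
      for (int& b : genome) b = rand() & 1;        // pretend the GA produced this point
      double best = coding_cost(genome);

      for (int iter = 0; iter < 10000; iter++) {   // random-bit-flip hill climbing
        int i = rand() % (int)genome.size();
        genome[i] ^= 1;                            // try one flip
        double c = coding_cost(genome);
        if (c < best) best = c;                    // keep it if the cost improved
        else genome[i] ^= 1;                       // otherwise undo the flip
      }
      printf("refined cost: %.2f\n", best);
      return 0;
    }
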
    BIT Archiver homepage: www.osmanturan.com

