Results 1 to 3 of 3

Thread: Data Structures for Context Model Statistics

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,366
    Thanks
    212
    Thanked 1,018 Times in 540 Posts

    Data Structures for Context Model Statistics


  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,366
    Thanks
    212
    Thanked 1,018 Times in 540 Posts
    I quoted the last part of icq log with links into post
    comments on ctxmodel but somehow it becomes much less
    attractive to me like this.
    I mean, I was interested in sequential modelling of data,
    not tree-based because the hierarchical processing blocks
    the possibilities for syntax parsing and using the
    alternative submodels.
    So guess I'd go for the 2-stage architecture with context
    maps + small hash, or even a tree.
    Anyway, i just need a data structure which would allow me
    to keep the bytewise statistics and would have a
    reasonable workaround for memory overflows.
    A hash is certainly not that.
    And making a good tree would be too complicated, and not
    really efficient, as it'd have to contain both occurence
    counters and probability counters (to provide a way to
    remove the contexts going out of the window).
    And then there's kinda nothing else.
    Though well, also string search algorithms are applicable
    if its a small block.
    I mean, with a block size like 1k it wouldn't be that slow
    to just search for higher order contexts from the block
    start.

  3. #3
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,366
    Thanks
    212
    Thanked 1,018 Times in 540 Posts
    There was also http://ctxmodel.net/rem.pl?-32
    and http://shelwien.googlepages.com/stat3.txt

    And today I made a dumb but working coder implementation
    http://ctxmodel.net/files/tree_v1.rar
    Its an order16 bytewise coder with full context statistics
    (precise number of context's occurences in the already
    processed data) and single rangecoder call per byte,
    which is rare these days.

    Alas, for now its just another remake of the same scheme
    like used in ppmy/ash.

    + Its much more portable and doesn't use floats
    + Tree implementation is compatible with 64-bit systems
    (should be, at least)
    + 215.7k on book1 with a very primitive mixing implementation

    - No tree overflow handling. However, with precise counters
    its possible to implement a sliding window.
    - Even simple parsing (collection of statistics) is considerably
    slow. It can be somewhat improved by using the native
    32-bit pointers for the tree, and other optimizations are
    possible, but something very significant has to be changed
    to make it practical... like using a faster secondary
    structure for global statistics.

    But still, its something different from paq and provides access
    to statistics like number of symbols in context and their
    probability distribution, which are not easy to acquire with
    bitwise hashes, but are quite useful for text compression.
    So at least its applicable for hutter challenge, despite its
    poor speed .

Similar Threads

  1. Simple bytewise context mixing demo
    By Shelwien in forum Data Compression
    Replies: 11
    Last Post: 27th January 2010, 04:12
  2. Context mixing
    By Cyan in forum Data Compression
    Replies: 7
    Last Post: 4th December 2009, 19:12
  3. UIQ2 statistics
    By schnaader in forum Data Compression
    Replies: 5
    Last Post: 17th September 2009, 22:36
  4. M01 - Order 01 context mixer
    By toffer in forum Data Compression
    Replies: 58
    Last Post: 17th June 2008, 19:29
  5. CMM fast context mixing compressor
    By toffer in forum Forum Archive
    Replies: 171
    Last Post: 24th April 2008, 14:57

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •