Results 1 to 15 of 15

Thread: Modelling techniques

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts

    Modelling techniques

    Any idea on restructuring this or adding something?

    1. Basic statistics
    1.1. CM (context modelling) with simple incremental counters (frequencies) -
    despite common beliefs it can be faster than working with probabilities.
    1.2. CM with shift-based probability counters (fpaq0-like)
    1.3. combinatorical multiplication-based probability counters (frequency simulation)
    1.4. state machine counters
    1.5. delayed counters
    1.6. delayed counters with context interpolation and indirect update
    1.7. (=2.5) +context quantization
    2. Secondary estimation
    2.1. Using a quantized probability in context
    2.2. Nonlinear probability mapping
    2.3. +Interpolation
    2.4. +Indirect update
    3. Prediction merging techniques
    3.1. Switching (eg. by amortized code length)
    3.2. Static linear mixing
    3.3. Adaptive linear mixing
    3.4. +Indirect update
    3.5. Multi-dimensional version of "secondary estimation"
    3.6. Update back-propagation
    4. Precision control
    4.1. Using interval arithmetics to calculate the prediction error (and make a correction)
    4.2. Adaptive correction by using the prediction error in contexts
    5. Parameter optimization techniques
    5.0. Manual
    5.1. Simple bruteforce
    5.2. CM-driven bruteforce (using correlations between parameter set and output size)
    5.3. Bayesian (likelihood-based) (eg. by polynomial approximation)
    6. Symmetry control
    6.1. Blockwise redundancy check
    6.2. Statistical segmentation
    6.3. Adaptive model optimization and parameter values encoding
    7. Applied modelling
    7.1. Hash function design
    7.2. Speed optimization
    7.2.1. Alphabet decomposition by Huffman's algorithm
    7.2.2. Faster processing of probable cases in general
    7.3. Serial improvement (adaptively using a secondary model to determine the design of primary model,
    eg. symbol ranking)
    7.4. Speculative processing (mostly relevant for decoding and threaded implementations)

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    Btw, I don't have any foundations but feel that "modeling" is something different from "modelling".
    Like http://www.start-modeling.com
    Also compare http://www.google.com/search?q=modeling and http://www.google.com/search?q=modelling
    Last edited by Shelwien; 30th May 2008 at 19:49.

  3. #3
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    It would be great to provide some code examples for us. For example:
    1.2. CM with shift-based probability counters (fpaq0-like)
    pr+=((y<<12)-pr)>>5;

    ...

    Just to create a small CM FAQ/glossary and intro!

  4. #4
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    You know, it'd take at least a page of description text for each technique
    and a link to ready-to-use coder as example to be of some use.

    Though in theory I have an idea about making a site dedicated to CM,
    but i want my own blog engine for it and being lazy to write it.

  5. #5
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts

    Wink

    For educational purposes would be better if we will provide one or two demo encoders with different techniques - just a different Predictor class or something. FPAQ0*, FCM1, M01 are good examples.
    Also it would be super-cool if Chris will release his own small/demo open source CM encoder!
    I'm still very interested in his CM/FSM-counters. I'm pretty sure that some day he will do that, or at least his ideas will be available to public! For example LZTURBO was already disassembled...

  6. #6
    Member
    Join Date
    May 2008
    Location
    brazil
    Posts
    163
    Thanks
    0
    Thanked 3 Times in 3 Posts
    Make a wiki is very good!

  7. #7
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Quote Originally Posted by encode View Post
    For educational purposes would be better if we will provide one or two demo encoders with different techniques - just a different Predictor class or something. FPAQ0*, FCM1, M01 are good examples.
    I think that's an excellent idea!

  8. #8
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    What about different alphabet decompositions? You mentioned huffman, but what about unary coding (by symbol rank)?
    A long time ago i tried a fixed decomposition in buckets of length 4, 8 and 12 (nibble aligned). Not as good as huffman, but the results were better than ordinary flat decomposition (expect "random" data).

    CM/FSM-counters
    What do you mean here?

  9. #9
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Quote Originally Posted by toffer View Post
    What do you mean here?
    1.4. state machine counters

  10. #10
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode View Post
    1.4. state machine counters
    Ok, got it. What about adding these decomposition methods?

  11. #11
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    I intended to write a list of techniques with increasing "advancement".
    Of course its not complete, and its true that the only example of
    alphabet decomposition there is 7.2.1, state machine counters is a
    completely unrelated topic.

    Also its nice that somebody finally said something relevant,
    but I'd like to see some really different ideas from already mentioned.
    There're some for sure.

  12. #12
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well, if you mention huffman decomposition you should also mention unary and flat dec. Huffman decomposition was already used within some CTW implementations; but isn't unary decomposition a bit more advanced?

    Why should state machine counters be unrelated? They were introduced with PAQ (not that old) and do a nice job at higher orders, since similar contexts are merged - and the PAQ implementation fits them into bytes, albeit they're constructed heuristically (not based on empirical data).

  13. #13
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    > Well, if you mention huffman decomposition you should also mention unary and flat dec.

    Its 7.2.1, an applied modelling example. There's also 7.2.2 which could cover unary etc.
    But its about optimizations (mostly speed) by applying modelling to compressor itself.

    And alphabet decomposition (along with whole coding) just isn't covered in the list.
    Maybe you're right and we should make it a separate category somewhere between
    (1) and (2).
    Though imho coding techniques deserve another list.

    > Huffman decomposition was already used within some CTW implementations;
    > but isn't unary decomposition a bit more advanced?

    Its not as its simpler to implement.
    Especially if its blockwise-static huffman tree with which we need to remap
    statistics from one decomposition to another.

    > Why should state machine counters be unrelated?

    I meant that they're unrelated to unary coding and decomposition at whole.
    Generally, you can use any decomposition with any counters.

    > They were introduced with PAQ (not that old)

    In public, maybe. I think its a common technique in communication/media industry.
    Eg. I doubt that it was adopted into h264 from paq.

    > PAQ implementation fits them into bytes, albeit they're

    ppmonstr isn't that worse with byte counters too, but without
    any quantization and lookups.

    Also I was only impressed with rnd-driven updates, when paq appeared.
    While other things seemed common sense.

  14. #14
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,985
    Thanks
    377
    Thanked 353 Times in 141 Posts
    Quote Originally Posted by Shelwien View Post
    Btw, I don't have any foundations but feel that "modeling" is something different from "modelling".
    Like http://www.start-modeling.com
    Also compare http://www.google.com/search?q=modeling and http://www.google.com/search?q=modelling
    In Matt's paper Adaptive Weighing of Context Models for Lossless Data Compression we may find the Online Modeling term. So I beleive that Modeling is correct word.

  15. #15
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    They're both correct, as you can see in any dictionary.
    But google shows how they're currently used.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •