Thread: More ideas for LZ matchfinding

  1. #1
    Shelwien (Administrator)

    More ideas for LZ matchfinding

    1. BWT suffix arrays

    cbloom seems happy about it, but he keeps talking about 128k windows,
    and at that level plain hash chains are imho the best solution.

    Anyway, I got disappointed after noticing that computing the BWT of the
    whole file is not a solution, and that to link up the SAs of smaller
    blocks we'd need additional pointers, so it would end up at 8N.

    Btw, I don't really understand cbloom's ideas, but my approach is
    to compute the BWT of overlapped blocks
    (0..2*winsize, winsize..3*winsize, ...), so that after reaching
    offset 2*winsize, we'd compute the BWT of the winsize..3*winsize block.
    That would work and allow finding matches within a window, but it's still 8N.
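    Here's a minimal sketch of that overlapped double-window scheme, with a naive
    stand-in for the SA builder (in practice something like divsufsort would be
    used; all names here are mine, just for illustration):

    Code:
    // Sketch of the overlapped double-window scheme: keep a suffix array over
    // [base, base+2*W) and rebuild it over [base+W, base+3*W) once the current
    // position crosses base+2*W.
    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <vector>
    
    // naive stand-in for a real SA builder (divsufsort etc.): sort suffixes directly
    static void build_sa(const uint8_t* p, size_t n, std::vector<int32_t>& sa) {
      sa.resize(n);
      for (size_t i = 0; i < n; i++) sa[i] = (int32_t)i;
      std::sort(sa.begin(), sa.end(), [&](int32_t a, int32_t b) {
        size_t la = n - a, lb = n - b, l = std::min(la, lb);
        int c = std::memcmp(p + a, p + b, l);
        return c ? c < 0 : la < lb;
      });
    }
    
    struct OverlappedSA {
      const uint8_t* buf;       // whole input buffer
      size_t total;             // its length
      size_t W;                 // LZ window size (winsize)
      size_t base = 0;          // start of the current 2*W block
      std::vector<int32_t> sa;  // suffix array of buf[base .. base+2*W)
    
      OverlappedSA(const uint8_t* b, size_t len, size_t winsize)
        : buf(b), total(len), W(winsize) { rebuild(); }
    
      void rebuild() {
        size_t n = std::min(2 * W, total - base);
        build_sa(buf + base, n, sa);   // 4N for the SA itself -> ~8N overall
      }
    
      // call before matchfinding at absolute position pos:
      // once pos crosses base+2*W, slide the block by one window and rebuild
      void advance(size_t pos) {
        if (pos >= base + 2 * W) { base += W; rebuild(); }
      }
    };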

    2. Refinement of NMF ( http://encode.su/threads/1278-Nibble-MMC-matchfinder )

    Basically it's just a real tree, unlike MMC where collisions are allowed.

    Merging the text buffer with the tree turned out slow. Also, afair Cyan(?) posted
    good ideas about nibble/byte symbol switching depending on alphabet size, and
    about lazy insertion (add strings to a hash chain first, only sort them in on lookups).
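    Here's a minimal sketch of the lazy insertion part (names are just illustrative,
    not the actual NMF code; the "sort" here is simply newest-first ordering, where
    NMF would sort entries into its nibble tree):

    Code:
    // Lazy insertion sketch: insert() just appends the position to a hash bucket;
    // the ordering work is deferred until a lookup actually touches that bucket.
    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <functional>
    #include <vector>
    
    struct LazyBucket {
      std::vector<uint32_t> pending;  // unsorted recently-added positions, O(1) insert
      std::vector<uint32_t> sorted;   // positions ordered newest-first, built on demand
    };
    
    struct LazyMF {
      static const uint32_t HASH_BITS = 16;
      std::vector<LazyBucket> table;
      LazyMF() : table(1u << HASH_BITS) {}
    
      static uint32_t hash4(const uint8_t* p) {
        uint32_t x; std::memcpy(&x, p, 4);
        return (x * 2654435761u) >> (32 - HASH_BITS);
      }
    
      void insert(const uint8_t* data, uint32_t pos) {    // cheap path, no sorting
        table[hash4(data + pos)].pending.push_back(pos);
      }
    
      // returns candidate positions, newest first; the sorting cost is paid here,
      // and only for buckets that actually get looked up
      const std::vector<uint32_t>& lookup(const uint8_t* data, uint32_t pos) {
        LazyBucket& b = table[hash4(data + pos)];
        if (!b.pending.empty()) {
          b.sorted.insert(b.sorted.end(), b.pending.begin(), b.pending.end());
          std::sort(b.sorted.begin(), b.sorted.end(), std::greater<uint32_t>());
          b.pending.clear();
        }
        return b.sorted;
      }
    };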

    But at best this would be more stable than bt4 in redundant cases; we can't
    expect any significant speed difference on normal files.

    3. Suffix trees

    Suffix pointers are really nice, and there's the idea about bitmask compression
    of tree nodes ( http://encode.su/threads/1173-New-da...for-bitwise-CM ),
    but managing memory allocation gets very hard, and overall consumption is much higher
    than 8N, so is there really a reason to implement this?
    (For LZ, that is; for CM/PPM there are fewer alternatives, because we have to store stats somewhere.)

    Another problem is window maintenance - sure, it's possible to delete nodes at
    the other end of the window, but that requires adding precise context counts to nodes
    (memory!) and makes parsing 2x slower.
    So working with overlapping double windows seems like the more convenient solution
    here too, but then BWT really looks nicer.

    4. Compressed match history

    The LZMA matchfinder finds the nearest substring occurrence per length, so
    if we simply dumped its output, we'd get something like this:

    Code:
    offset: match offset array
    1F01: 
    1F02: 1E34 
    1F03: 1E35 1E35 
    1F04: 1E36 1E36 1E36 
    1F05: 1E37 1E37 1E37 1E37 
    1F06: 1E38 1E38 1E38 1E38 1E38 
    1F07: 1E39 1E39 1E39 1E39 1E39 1E39 
    1F08: 1E3A 1E3A 1E3A 1E3A 1E3A 1E3A 1E3A 
    1F09: 1E3B 1E3B 1E3B 1E3B 1E3B 1E3B 1E3B 1E3B 
    1F0A: 1E3C 1E3C 1E3C 1E3C 1E3C 1E3C 1E3C 1E3C 1E3C 
    1F0B: 1E3D 1E3D 1E3D 1E3D 1E3D 1E3D 1E3D 1E3D 1E3D 1E3D 
    1F0C: 1E55 1E3E 1E3E 1E3E 1E3E 1E3E 1E3E 1E3E 1E3E 1E3E 1E3E 
    1F0D: 1EC5 09E4 
    [...]
    2700: 267B 
    2701: 25BE 
    2702: 1F05 1F05 
    2703: 2649 1F06 1F06 
    2704: 1F07 1F07 1F07 1F07 
    2705: 1F08 1F08 1F08 1F08 1F08 
    2706: 1F09 1F09 1F09 1F09 1F09 1F09 
    2707: 1F0A 1F0A 1F0A 1F0A 1F0A 1F0A 1F0A 
    2708: 1F0B 1F0B 1F0B 1F0B 1F0B 1F0B 1F0B 1F0B 
    2709: 2600 16B9 16B9 16B9 16B9 
    270A: 2601 2601 16BA 16BA 16BA 16BA 
    270B: 2602 2602 2602 16BB 16BB 16BB 16BB 
    270C: 1C52 1C52 1C52 
    270D: 265A 1C53 1C53 1C53 
    270E: 2552 23F9 1C54 1C54 1C54 
    270F: 2519 2519 23FA 1C55 1C55 1C55
    The idea is to directly compress (dedup) this table and link it up.
    One benefit is that such a matchfinder would be able to provide the
    main engine with (compressed) distance arrays for whole blocks
    (while working in its own thread).
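    Here's a sketch of one possible dedup, based on the observation from the dump
    above that most rows are just the previous row with every entry incremented by
    one (the same matches, extended by one byte); row format and names are made up
    for illustration, not an actual LZMA structure:

    Code:
    // A row that extends the previous one is stored as a flag plus any appended
    // entries; otherwise the full row is stored.
    #include <cstddef>
    #include <cstdint>
    #include <vector>
    
    struct Row { std::vector<uint32_t> match_pos; };  // nearest occurrence per length
    
    struct PackedRow {
      bool extends_prev;            // row == prev row with every entry +1 ...
      std::vector<uint32_t> extra;  // ... plus these appended entries
    };                              // (or the full row if extends_prev is false)
    
    PackedRow pack(const Row& prev, const Row& cur) {
      PackedRow out;
      size_t n = prev.match_pos.size();
      bool ok = cur.match_pos.size() >= n;
      for (size_t i = 0; ok && i < n; i++)
        ok = (cur.match_pos[i] == prev.match_pos[i] + 1);
      out.extends_prev = ok;
      size_t from = ok ? n : 0;
      out.extra.assign(cur.match_pos.begin() + from, cur.match_pos.end());
      return out;
    }
    
    Row unpack(const Row& prev, const PackedRow& p) {
      Row out;
      if (p.extends_prev)
        for (uint32_t x : prev.match_pos) out.match_pos.push_back(x + 1);
      out.match_pos.insert(out.match_pos.end(), p.extra.begin(), p.extra.end());
      return out;
    }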

    5. Direct hashing and bloom filters ( http://en.wikipedia.org/wiki/Bloom_filter )

    Recently I tested uncached RAM read speed again (first time in a long while),
    and was really impressed by the results (600-700 clocks).
    Btw, 600 clocks at 3.52GHz is ~170ns, which is actually not that much,
    taking into account page directory lookups on a TLB miss.

    Anyway, the point is that it basically doesn't matter how much computing we do,
    as long as we don't access any uncached memory.

    Thus the following idea: implement direct hashtable lookups
    not only for o2/o3/o4, but all the way up to, say, o12.
    Plain direct lookups would obviously be slow, so here's the
    trick: use small Bloom-filter hashtables to exclude missing contexts,
    which would be frequent for the higher orders.
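    Here's a minimal sketch with made-up sizes and hash functions: one big direct
    hashtable per order plus a small (cache-resident) Bloom filter per order, so the
    expensive uncached read only happens when the filter says the context may have
    occurred:

    Code:
    // Per-order direct hashing guarded by a small Bloom filter.  The filter is
    // small enough to stay in cache, so contexts that never occurred (frequent
    // at o8..o12) are rejected without touching uncached RAM.
    #include <cstdint>
    #include <vector>
    
    struct BloomFilter {
      std::vector<uint64_t> bits;
      uint32_t mask;
      explicit BloomFilter(uint32_t log2bits)       // log2bits >= 6
        : bits((1u << log2bits) / 64, 0), mask((1u << log2bits) - 1) {}
      static uint32_t mix(uint64_t h, uint32_t i) {
        return (uint32_t)((h >> (i * 17)) * 2654435761u);
      }
      void add(uint64_t h) {
        for (uint32_t i = 0; i < 2; i++) {          // k = 2 hash functions
          uint32_t b = mix(h, i) & mask;
          bits[b >> 6] |= 1ull << (b & 63);
        }
      }
      bool maybe_contains(uint64_t h) const {
        for (uint32_t i = 0; i < 2; i++) {
          uint32_t b = mix(h, i) & mask;
          if (!(bits[b >> 6] & (1ull << (b & 63)))) return false;  // definitely absent
        }
        return true;                                // possibly present
      }
    };
    
    struct OrderTable {                             // direct hashtable for one order
      std::vector<uint32_t> last_pos;               // last position of each hashed context
      BloomFilter seen;
      uint32_t mask;
      OrderTable(uint32_t log2slots, uint32_t log2filter)
        : last_pos(1u << log2slots, 0), seen(log2filter), mask((1u << log2slots) - 1) {}
    
      void insert(uint64_t ctx_hash, uint32_t pos) {
        seen.add(ctx_hash);
        last_pos[ctx_hash & mask] = pos;            // uncached write, but no read stall
      }
      // returns 0 if the context was (almost certainly) never seen
      uint32_t lookup(uint64_t ctx_hash) const {
        if (!seen.maybe_contains(ctx_hash)) return 0;  // cheap, cache-resident check
        return last_pos[ctx_hash & mask];              // the expensive uncached read
      }
    };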

  2. #2
    Bulat Ziganshin (Programmer)
    The last para is an interesting idea. I used a sort of it to make srep faster - checking a 1-bit "present?" flag before the actual hash access. But it's problematic to use with a rolling window (srep just stores flags for the entire file).
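    Roughly, such a whole-file flag array looks like this (a sketch for illustration,
    not the actual srep code); the rolling-window problem is that bits can only be
    set, never cleared, so stale flags accumulate once old data leaves the window:

    Code:
    // One bit per hash slot, set on insert and checked before the real hashtable access.
    #include <cstdint>
    #include <vector>
    
    struct PresenceFlags {
      std::vector<uint64_t> bits;
      uint32_t mask;
      explicit PresenceFlags(uint32_t log2slots)    // log2slots >= 6
        : bits((1u << log2slots) / 64, 0), mask((1u << log2slots) - 1) {}
      void set(uint32_t h)        { h &= mask; bits[h >> 6] |= 1ull << (h & 63); }
      bool test(uint32_t h) const { h &= mask; return (bits[h >> 6] >> (h & 63)) & 1; }
    };
    // usage: if (flags.test(h)) { /* only now do the expensive hashtable probe */ }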
