
Thread: Index-Compress-Update: parallel LZ match finder algo

  1. #1
    Bulat Ziganshin


    The main problem with a parallel implementation of LZ match searching is that we need different match indices for the blocks processed simultaneously. I propose to solve this problem by maintaining small "incremental" search indices. For example, consider compressing a 64 KB block with an 8 MB dictionary in the following steps:

    1. we split the 64 KB block into four 16 KB subblocks and index them all
    2. we compress all subblocks simultaneously, searching for matches both in the index of the preceding 8 MB and in the indices of the subblocks preceding the current one
    3. we update the main 8 MB index with the small indices of the subblocks (see the sketch below)
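
    Below is a minimal sketch of the three phases in C++ (my choice of language; the post shows no code). MainIndex, SmallIndex and the hash layout are hypothetical placeholders, and the actual match verification / encoding is only indicated in comments.

    Code:
    #include <cstdint>
    #include <cstddef>
    #include <cstring>
    #include <thread>
    #include <unordered_map>
    #include <vector>

    // hash of the 4 bytes at p (placeholder multiplicative hash)
    static uint32_t hash4(const uint8_t* p) {
        uint32_t v; std::memcpy(&v, p, 4);
        return (v * 2654435761u) >> 16;
    }

    struct SmallIndex {                                  // index of one 16 KB subblock
        std::unordered_multimap<uint32_t, size_t> tbl;   // hash -> absolute position
        void build(const uint8_t* base, size_t from, size_t to) {
            for (size_t i = from; i + 4 <= to; ++i)
                tbl.emplace(hash4(base + i), i);
        }
    };

    struct MainIndex {                                   // index of the 8 MB dictionary
        std::unordered_multimap<uint32_t, size_t> tbl;
        void merge(const SmallIndex& s) { tbl.insert(s.tbl.begin(), s.tbl.end()); }
    };

    constexpr size_t SUB = 16 * 1024;

    void process_block(const uint8_t* base, size_t block_start, MainIndex& main) {
        SmallIndex small[4];

        // 1. Index: build the four 16 KB sub-indices (this step could run in parallel too)
        for (int i = 0; i < 4; ++i)
            small[i].build(base, block_start + i * SUB, block_start + (i + 1) * SUB);

        // 2. Compress: each subblock searches the main index plus the small
        //    indices of the subblocks preceding it inside this 64 KB block
        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i)
            workers.emplace_back([&, i] {
                // for each position p in subblock i:
                //   candidates = main.tbl.equal_range(hash4(p))
                //              + small[j].tbl.equal_range(hash4(p)) for j < i
                //   verify candidates against the data and emit the best match (codec omitted)
            });
        for (auto& t : workers) t.join();

        // 3. Update: fold the small indices into the main 8 MB index
        for (int i = 0; i < 4; ++i)
            main.merge(small[i]);
    }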

    Of course, a real algorithm should use a sliding dictionary. Every subblock should then be processed in 3 steps (a scheduling sketch follows the list):
    1. build its own small index
    2. compress using the large index built N subblocks earlier plus the small indices of the previous N subblocks
    3. update the large index with our small index data once all previous subblocks have finished compressing
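
    A rough scheduling sketch of this per-subblock pipeline, reusing the hypothetical SmallIndex/MainIndex types from the sketch above; the window N = 4 and the condition-variable scheme are my assumptions, not something the scheme prescribes.

    Code:
    #include <condition_variable>
    #include <deque>
    #include <mutex>

    struct Scheduler {
        MainIndex               main;            // large index, lagging N subblocks behind the input
        std::deque<SmallIndex>  recent;          // small indices of the last N subblocks
        std::mutex              m;
        std::condition_variable cv;
        size_t                  next_to_merge = 0;   // subblock whose turn it is to update 'main'

        void run_subblock(size_t k, const uint8_t* base, size_t from, size_t to) {
            // 1. build this subblock's own small index
            SmallIndex mine;
            mine.build(base, from, to);

            // 2. compress: search 'main' (built N subblocks ago) plus the small
            //    indices in 'recent' of already-compressed subblocks (codec omitted;
            //    a real implementation must also guard reads of 'recent' and 'main')

            // 3. merge in subblock order: by the time it is our turn, every
            //    previous subblock has finished compressing and merging
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return next_to_merge == k; });
            main.merge(mine);
            recent.push_back(std::move(mine));
            if (recent.size() > 4) recent.pop_front();   // keep only the last N = 4
            ++next_to_merge;
            cv.notify_all();
        }
    };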

    This algorithm requires searching N small indices for matches. Alternatively, we can use a single small index that always holds the last 64 KB. At the start of every subblock it would be built from a copy of the previous subblock's index: the build process removes the obsolete entries for the oldest 16 KB and adds entries for the new 16 KB (see the sketch below).
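
    A sketch of that alternative: each subblock derives its rolling 64 KB index from a copy of the previous one, dropping entries for the oldest 16 KB and indexing the newest 16 KB. It reuses the hypothetical SmallIndex type from above.

    Code:
    #include <iterator>

    SmallIndex roll_small_index(const SmallIndex& prev, const uint8_t* base,
                                size_t new_from, size_t new_to) {
        SmallIndex cur = prev;                   // start from the previous subblock's index
        // the index must cover only the last 64 KB
        const size_t window_start = new_to > 64 * 1024 ? new_to - 64 * 1024 : 0;

        // remove obsolete entries belonging to the oldest 16 KB
        for (auto it = cur.tbl.begin(); it != cur.tbl.end(); )
            it = (it->second < window_start) ? cur.tbl.erase(it) : std::next(it);

        // add entries for the new 16 KB subblock
        cur.build(base, new_from, new_to);
        return cur;
    }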


    The third step, Update, should only be performed while all threads executing the Compress step are locked out (unless we have an index that cannot be broken by a concurrent update, and we are able to skip the duplicate matches found by both the small and the large index). The Update procedure may itself be multithreaded; a locking sketch follows.
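
    One way to realize "compress threads are locked out during Update" is a reader/writer lock: compress threads take shared locks around their probes of the large index, while the Update step takes the exclusive lock. std::shared_mutex and the shard-per-thread remark are my choices, not something fixed by the scheme.

    Code:
    #include <shared_mutex>

    std::shared_mutex main_index_lock;

    // Compress threads: shared (read) lock, so lookups from many threads proceed concurrently
    std::vector<size_t> lookup_main(const MainIndex& main, uint32_t h) {
        std::shared_lock<std::shared_mutex> rd(main_index_lock);
        std::vector<size_t> positions;
        auto range = main.tbl.equal_range(h);
        for (auto it = range.first; it != range.second; ++it)
            positions.push_back(it->second);
        return positions;
    }

    // Update step: the exclusive (write) lock blocks all compress threads while the
    // small indices are folded in; the merge itself could be multithreaded by
    // splitting the hash space into disjoint shards, one thread per shard
    void update_main(MainIndex& main, const SmallIndex* small, int n) {
        std::unique_lock<std::shared_mutex> wr(main_index_lock);
        for (int i = 0; i < n; ++i)
            main.merge(small[i]);
    }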


    PS: I saw this multiplication of hash indices 10 years ago, when we switched from a single index on 3 bytes to 3+4 and similar schemes. It would be funny to see it again, now driven by the requirements of multicore CPUs. The world is still waiting for truly multi-core optimized software, including LZ compression algorithms.
    Last edited by Bulat Ziganshin; 13th August 2011 at 13:14.
