Thread: Multi-Thread and compression speed

  1. #1
    Member BetaTester

    Multi-Thread and compression speed

    I am using LZMA2 (7-Zip) with multi-threading.

    During the first 50% of the compression, 7-Zip is fast.
    During the remaining 50%, 7-Zip is very slow.
    Most of the threads are idle once the overall process passes 50%.

    With the PAQ8 family, the same problem occurs.
    Processing is very slow, and only 9% of the total CPU power is used.
    The other 91% goes unused and is free for playing games, etc.

    Has anyone thought about threading in the PAQ family?
    If it used 100% of the CPU, would PAQ be about 10x faster?

    By increasing the number of threads (to ~16), you could allocate more system resources and gain more speed.

  2. #2
    Member m^2
    Search the forum; there have been numerous discussions about threading in CM (context mixing).
    In short: nobody has figured out how to do it well yet.

  3. #3
    Expert Matt Mahoney
    PAQ is single-threaded. ZPAQ uses multi-threading by dividing the input into blocks and compressing them separately. This loses some compression in exchange for speed. It allocates one thread per core. When a thread has finished compressing a block, it takes the next block that hasn't been started by any other thread. This keeps all of the cores busy until the compression is finished.
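
    The block scheduling described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not ZPAQ's actual code: zlib stands in for the compressor, and the names compress_blocks, decompress_blocks, and BLOCK_SIZE are hypothetical. The thread pool's internal queue gives the behaviour described in the post: a worker that finishes a block picks up the next one that has not been started yet.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1 << 20  # hypothetical block size (1 MiB), chosen for illustration


def compress_blocks(data, block_size=BLOCK_SIZE, workers=None):
    """Split data into independent blocks and compress them in parallel.

    Returns the compressed blocks in their original order.
    """
    # Cut the input into fixed-size blocks; the last one may be shorter.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # One worker per core by default, as in the scheme described above.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Idle workers pull the next unstarted block from the pool's queue;
        # map() still yields the results in input order.
        return list(pool.map(zlib.compress, blocks))


def decompress_blocks(compressed):
    """Decompress each block and join the pieces back together."""
    return b"".join(zlib.decompress(block) for block in compressed)
```

    Because each block is compressed without knowledge of the others, redundancy that spans block boundaries is not exploited, which is the compression loss mentioned above. Larger blocks reduce that loss but leave fewer units of work to balance across the cores.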

  4. #4
    Member Jean-Marie Barone
    I have worked on a version of SMAC in which I dedicated a thread to updating the weights in the mixer, in order to gain a little speed.
    It was quite difficult because of race conditions (several threads accessing the same data). These can be avoided by using lock prefixes in ASM, or implicitly locked instructions such as xchg.

    There was also false sharing to avoid: "when a thread running on one core tries to read or write data that is currently present in modified state in the first level cache of the other core, this will cause eviction of the modified cache line back into memory and reading it into the first-level cache of the other core".
    This can be solved by padding your shared data so that each variable sits in its own cache line (128 bytes).

    In the end, I saw no speed gain compared to my single-threaded version of SMAC. I have concluded that creating several threads that share the same data should be avoided...
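
    As a rough illustration of the race condition described above, here is a Python analogue. It is not SMAC's code (the post refers to assembly with lock prefixes); threading.Lock plays the role of the lock, and update_weights and run_updates are hypothetical names. Each thread performs a read-modify-write on the shared weights, and the lock makes each update indivisible so that no increment is lost.

```python
import threading


def update_weights(weights, delta, lock, rounds):
    """Repeatedly add delta to every shared weight (a read-modify-write)."""
    for _ in range(rounds):
        with lock:  # only one thread may touch the weights at a time
            for i in range(len(weights)):
                weights[i] += delta


def run_updates(num_threads=4, rounds=1000, size=8):
    """Run several updater threads against one shared weight list."""
    weights = [0] * size
    lock = threading.Lock()
    threads = [
        threading.Thread(target=update_weights, args=(weights, 1, lock, rounds))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return weights
```

    The lock also hints at why sharing data can erase the speed-up: while one thread holds it, the others wait, so the updates are effectively serialized.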


  5. #5
    Member
    Quote: Originally posted by Jean-Marie Barone

    In the end, I saw no speed gain compared to my single-threaded version of SMAC. I have concluded that creating several threads that share the same data should be avoided...

    Is that true even if the threads are running on the same core via Hyperthreading? It might not be easy to force them to run on the same core, though.

  6. #6
    Member Jean-Marie Barone
    I ran my experiments on my Intel Core 2 Duo (Merom), which I don't think supports Hyper-Threading. Maybe the results would be different on a more modern processor. Still, I'm not sure that two threads running on the same core would provide a significant speed gain compared to the single-threaded version.

  7. #7
    Member
    If it ran at the same speed as the single-threaded version, and not any slower, on an old CPU, then that could be a good sign that it may run faster on a new CPU. Cache coherency circuitry has been evolving.
