
Thread: deflate decoding benchmark

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,966
    Thanks
    295
    Thanked 1,298 Times in 735 Posts

    deflate decoding benchmark

    I wanted to port libdeflate for my use, but found that it only works memory to memory, while I need stream processing.
    So I decided to optimize reflate's implementation instead.
    But the results I've got kinda look too good, so can somebody check this?
    Maybe I failed at compiling reference implementations, or something?
    But then, libdeflate's gzip.exe is from its release archive.
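To illustrate the memory-to-memory vs. stream distinction mentioned above, here is a minimal sketch using Python's standard `zlib` module (not libdeflate or reflate). The function names `inflate_whole` and `inflate_stream` and the chunk size are made up for the example.

```python
import gzip
import io
import zlib

def inflate_whole(gz_bytes: bytes) -> bytes:
    """Memory-to-memory: whole input and whole output are held in RAM."""
    return zlib.decompress(gz_bytes, wbits=31)  # 31 = gzip wrapper, 32 KiB window

def inflate_stream(src, dst, chunk_size: int = 64 * 1024) -> int:
    """Streaming: read and write bounded chunks, never the whole file at once."""
    d = zlib.decompressobj(wbits=31)
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = d.decompress(chunk)
        dst.write(out)
        total += len(out)
    tail = d.flush()  # anything still buffered inside the decoder
    dst.write(tail)
    return total + len(tail)

payload = b"deflate decoding benchmark\n" * 10_000
gz = gzip.compress(payload)
sink = io.BytesIO()
written = inflate_stream(io.BytesIO(gz), sink, chunk_size=4096)
print(written == len(payload), inflate_whole(gz) == sink.getvalue())
```

The streaming variant only needs the decoder state plus one chunk of input and output at a time, which is what matters when the data comes from a pipe or doesn't fit in memory.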

    Code:
      3.812s:  cf.exe -d <C:\9A49-libdeflate\enwik9.gz >nul  -- cloudflare zlib
      2.813s:  intel.exe -d<C:\9A49-libdeflate\enwik9.gz>nul -- intel zlib
      2.797s:  ng.exe -d <C:\9A49-libdeflate\enwik9.gz >nul  -- zlib-ng
      1.969s:  gzip.exe -dc C:\9A49-libdeflate\enwik9.gz>nul -- libdeflate
    
      1.969s:  gzip.exe -dc C:\9A49-libdeflate\enwik9.gz>nul -- output to nul    
      2.516s:  gzip.exe -dc C:\9A49-libdeflate\enwik9.gz>z:e -- output to ramdisk
      4.828s:  gzip.exe -dc C:\9A49-libdeflate\enwik9.gz>e   -- output to ssd    
    
      2.031s:  gz2unp.exe  C:\9A49-libdeflate\enwik9gz1.gz nul
      1.250s:  gz2unpMT.exe  C:\9A49-libdeflate\enwik9gz1.gz nul
      1.281s:  raw2unp.exe C:\9A49-libdeflate\enwik9gz1.gz z:e
      1.313s:  raw2unp.exe C:\9A49-libdeflate\enwik9gz1.gz e
    http://nishi.dreamhosters.com/u/gz2unp_v0.rar

    The non-MT version can be faster: I'm getting about 1.6s with gz2unp built for an AVX512 target.
    The MT version doesn't decode multiple blocks in parallel or anything like that; its threads split the work into Huffman decoding and match copy/CRC.
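A rough structural sketch of that kind of split: one thread produces tokens, another does the match copy and CRC. This is not gz2unpMT's code. The token format, `token_stage`, and `copy_stage` are hypothetical, and the Huffman stage is faked with a pre-decoded token list. In Python the GIL means this shows only the shape of the pipeline, not a speedup.

```python
import queue
import threading
import zlib

def token_stage(tokens, q):
    """Stand-in for the Huffman-decoding thread: emits already-decoded tokens."""
    for tok in tokens:
        q.put(tok)
    q.put(None)  # end-of-stream marker

def copy_stage(q, result):
    """Match-copy/CRC thread: rebuilds the output and checksums it incrementally."""
    out = bytearray()
    crc = 0
    while (tok := q.get()) is not None:
        start = len(out)
        if tok[0] == "lit":
            out += tok[1]
        else:
            _, dist, length = tok
            src = start - dist
            for i in range(length):  # byte by byte, so overlapping matches work
                out.append(out[src + i])
        crc = zlib.crc32(out[start:], crc)  # checksum only the newly added bytes
    result["data"], result["crc"] = bytes(out), crc

# ("lit", bytes) is a literal run, ("match", distance, length) is an LZ77 match
tokens = [("lit", b"abc"), ("match", 3, 6), ("lit", b"!")]
q = queue.Queue(maxsize=64)  # bounded, so the producer can't run far ahead
result = {}
workers = [threading.Thread(target=token_stage, args=(tokens, q)),
           threading.Thread(target=copy_stage, args=(q, result))]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(result["data"], f"{result['crc']:08X}")
```

With this layout the total time is bounded by the slower stage, so CRC work on the copy thread is effectively free as long as token production is the bottleneck.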

    https://github.com/ebiggers/libdeflate
    https://github.com/Dead2/zlib-ng
    https://github.com/cloudflare/zlib
    https://github.com/jtkukunas/zlib (intel)

  2. Thanks (3):

    Bulat Ziganshin (22nd August 2018),JamesB (22nd August 2018),Mike (22nd August 2018)

  3. #2
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    615
    Thanks
    260
    Thanked 242 Times in 121 Posts
    Do you check the CRC? I'd suppose most of the programs only check in a special mode, but it could explain some of the runtime difference if any do. You could corrupt the .gz file by modifying compressed data or the CRC and see which of them give errors.
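A minimal sketch of the corruption test suggested above, assuming a single-member .gz file. It flips one bit in the CRC-32 stored in the gzip trailer and checks whether a decoder complains. `corrupt_crc` and `checks_crc` are hypothetical helper names; Python's own `gzip.decompress` stands in for the programs under test.

```python
import gzip
import struct

def corrupt_crc(gz_bytes: bytes) -> bytes:
    """Flip one bit of the stored CRC-32.
    The gzip trailer is the last 8 bytes: CRC-32, then ISIZE, both little-endian."""
    crc, isize = struct.unpack("<II", gz_bytes[-8:])
    return gz_bytes[:-8] + struct.pack("<II", crc ^ 1, isize)

def checks_crc(decoder, gz_bytes: bytes) -> bool:
    """True if `decoder` rejects a stream whose stored CRC is wrong."""
    try:
        decoder(corrupt_crc(gz_bytes))
    except Exception:
        return True
    return False

sample = gzip.compress(b"enwik9 stand-in " * 1000)
print(checks_crc(gzip.decompress, sample))
```

For the command-line tools in the benchmark, the same corrupted file can be written to disk and fed to each exe, and the exit code or error output compared.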
    http://schnaader.info
    Damn kids. They're all alike.

  4. #3
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,026
    Thanks
    103
    Thanked 410 Times in 285 Posts
    Code:
    gzip 1.2.4 Win32 (02 Dec 97)	5.036 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    cf				3.672 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    intel				2.735 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    ng				2.722 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    gz2unp				1.898 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    gzip				1.771 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    gz2unpMT			1.140 sec., e206c3450ac99950df65bf70ef61a12d *enwik9
    Last edited by Sportman; 22nd August 2018 at 21:57.

  5. #4
    Administrator Shelwien's Avatar
    > Do you check the CRC?

    gz2unpMT calculates the CRC (it prints both the calculated value and the one stored in the .gz file), but that doesn't affect the speed, because the Huffman thread is the slower one anyway.

    @Sportman:

    Thanks, can you test with some other (large) file?
    Also, I didn't do much for stored block optimization, so there can be interesting results on incompressible files.
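For context on stored blocks: in deflate (RFC 1951) a block with BTYPE=00 carries raw bytes preceded by LEN and its one's complement NLEN, and this is typically what an encoder emits for incompressible data. A small sketch that forces such a block with Python's `zlib` at level 0 and inspects its header; `first_block_header` is a hypothetical helper name.

```python
import os
import zlib

def first_block_header(raw_deflate: bytes):
    """Return (BFINAL, BTYPE) from the first three bits of a raw deflate stream."""
    b0 = raw_deflate[0]
    return b0 & 1, (b0 >> 1) & 3

data = os.urandom(1000)  # incompressible input
c = zlib.compressobj(level=0, wbits=-15)  # level 0 = store only, -15 = raw deflate
raw = c.compress(data) + c.flush()

bfinal, btype = first_block_header(raw)
# After the 3 header bits a stored block skips to the next byte boundary,
# then has LEN and NLEN as 16-bit little-endian values, then the raw bytes.
length = int.from_bytes(raw[1:3], "little")
nlen = int.from_bytes(raw[3:5], "little")
print(btype, length, nlen == (~length & 0xFFFF))
```

A decoder's stored-block path is essentially a bulk copy of LEN bytes, so on such files the timing depends more on I/O and memcpy speed than on Huffman decoding.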

    Also, maybe I uploaded the wrong version of libdeflate?
    There are exes in https://github.com/ebiggers/libdefla...x86_64-bin.zip

  6. #5
    Member
    Originally posted by Shelwien:
    > Thanks, can you test with some other (large) file?
    > Also, I didn't do much for stored block optimization, so there can be interesting results on incompressible files.

    Easy data:
    Code:
    cf		2.327 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    ng		1.733 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    intel		1.700 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    gz2unp		1.234 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    gzip		1.205 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    gz2unpMT	0.783 sec., fe8fbad0f0139d5f1e416a81467facb3 *html.txt
    Difficult data:
    Code:
    cf		3.004 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin
    gz2unp		2.557 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin
    ng		2.533 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin
    intel		2.502 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin
    gzip		1.739 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin
    gz2unpMT	1.549 sec., c5e6cc45d10960528b44b73495bfde28 *test.bin

  7. Thanks:

    Shelwien (22nd August 2018)

  8. #6
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Originally posted by Shelwien:
    > I wanted to port libdeflate for my use, but found that it only works memory to memory, while I need stream processing.
    > So I decided to optimize reflate's implementation instead.
    > But the results I've got kinda look too good, so can somebody check this?
    > Maybe I failed at compiling reference implementations, or something?
    > But then, libdeflate's gzip.exe is from its release archive.

    What is reflate? I thought there was just deflate and inflate in gzip implementations. Which program in the table has your optimization?

    zlib-ng is better than I thought. Is it streaming? Can it be used in nginx?

  9. #7
    Administrator Shelwien's Avatar
    > zlib-ng is better than I thought. Is it streaming? Can it be used in nginx?

    It has zlib interface, so yeah.
    Only libdeflate would be hard to use, other implementations are supposed to be transparent zlib replacements.

    > What is reflate?

    It's a deflate recompressor.
    It uses my own deflate library, but the reflate version focuses on error detection rather than speed.

  10. #8
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    What about Chromium's zlib?

  11. #9
    Administrator Shelwien's Avatar
    You mean this? https://chromium.googlesource.com/ch...rd_party/zlib/
    I compiled it, but it shows 3.078s and seems to be an annoying version of the Intel patch.
    See https://chromium.googlesource.com/ch...patches/README

  12. #10
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    506
    Thanks
    187
    Thanked 177 Times in 120 Posts
    Originally posted by Shelwien:
    > > zlib-ng is better than I thought. Is it streaming? Can it be used in nginx?
    >
    > It has zlib interface, so yeah.
    > Only libdeflate would be hard to use, other implementations are supposed to be transparent zlib replacements.

    What I find baffling, given zlib-ng was meant to simply be a modernised and optimised zlib drop-in replacement, is how come the standard Linux distributions still stick with the ancient zlib and don't appear to offer alternatives. Zlib is so ubiquitous it could make a significant difference to many applications.

  13. #11
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    404
    Thanked 403 Times in 153 Posts
    Testing results on my i5-9600K:
    Code:
    C:\gz2unp_v0>timer cf.exe -d enwik9.gz
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    
    Kernel Time  =     0.375 = 00:00:00.375 =   9%
    User Time    =     3.453 = 00:00:03.453 =  90%
    Process Time =     3.828 = 00:00:03.828 = 100%
    Global Time  =     3.828 = 00:00:03.828 = 100%
    
    C:\gz2unp_v0>timer intel.exe -d enwik9.gz
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    
    Kernel Time  =     0.468 = 00:00:00.468 =  15%
    User Time    =     2.468 = 00:00:02.468 =  84%
    Process Time =     2.937 = 00:00:02.937 = 100%
    Global Time  =     2.937 = 00:00:02.937 = 100%
    
    C:\gz2unp_v0>timer ng.exe -d enwik9.gz
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    
    Kernel Time  =     0.437 = 00:00:00.437 =  14%
    User Time    =     2.531 = 00:00:02.531 =  85%
    Process Time =     2.968 = 00:00:02.968 = 100%
    Global Time  =     2.968 = 00:00:02.968 = 100%
    
    C:\gz2unp_v0>timer gz2unp.exe enwik9.gz nul
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    
    Kernel Time  =     0.046 = 00:00:00.046 =   2%
    User Time    =     1.765 = 00:00:01.765 =  97%
    Process Time =     1.812 = 00:00:01.812 = 100%
    Global Time  =     1.812 = 00:00:01.812 = 100%
    
    C:\gz2unp_v0>timer gz2unpMT.exe enwik9.gz nul
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    g_crc32=F39E3AFB g_len32=3B9ACA00
    u_crc32=F39E3AFB u_len32=3B9ACA00
    
    Kernel Time  =     0.015 = 00:00:00.015 =   1%
    User Time    =     1.984 = 00:00:01.984 = 178%
    Process Time =     2.000 = 00:00:02.000 = 180%
    Global Time  =     1.109 = 00:00:01.109 = 100%
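A quick way to read the gz2unpMT output above, as plain arithmetic on the numbers shown. The variable names are made up, and reading `len32` as the uncompressed length is an assumption based on the value matching enwik9's size. The percentages appear to be each time divided by Global Time, truncated to an integer.

```python
# Values copied from the gz2unpMT run above
len32_hex = "3B9ACA00"
user_time, process_time, global_time = 1.984, 2.000, 1.109

uncompressed_len = int(len32_hex, 16)               # bytes of decoded output
throughput_mb_s = uncompressed_len / global_time / 1e6

# Percentages above 100% mean more than one core was busy on average
user_pct = int(user_time / global_time * 100)
process_pct = int(process_time / global_time * 100)

print(uncompressed_len, round(throughput_mb_s), user_pct, process_pct)
```

The same division applied to the single-threaded runs stays at or below 100%, which is why only the MT run shows 178% and 180%.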

  14. Thanks:

    Shelwien (29th November 2019)
