
Thread: True gzip successor

  1. #1
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts

    True gzip successor

    Hi all – Is there a fundamental or theoretical reason why there isn't a compressor better than gzip that is just as light on memory and CPU? According to the LTCB, gzip only uses 1.6 MB of RAM.

    If you had to use at most the resources used by gzip (memory and CPU), could you build a better compressor? Is there an insurmountable problem of computational complexity or something? Like if you use context modeling or FSE or ANS...

    What if we had full use of the modern CPU instruction set, so things like carryless multiplication, CRC32, AVX2, the Bit Manipulation Instructions, SSE4.2 string matching, etc.? And whatever their counterparts are on an ARMv8 server CPU.

    (Note that the "gzip" that Mahoney uses on the LTCB is an extremely obsolete Windows application that no longer exists, but may be based on a predecessor to the current GNU gzip. His app is from 2006 or so. I'm not clear on what the implications are for memory use and CPU, but the modern zlib gzipper will be faster, and might compress better. The Cloudflare and libdeflate implementations of gzip will be faster still, not to mention SLZ.)

    This article is interesting in that Expedia thought LZMA2 was the winner of their testing, but they discovered that with on-the-fly compression it used too many resources, so gzip was actually the winner...

  2. #2
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    342
    Thanks
    197
    Thanked 58 Times in 42 Posts
    Well, many compressors do exceed the compression ratio achieved by gzip. What has happened is that, as time has progressed and the average amount of RAM in PCs has increased, compressors have simply allocated more memory--because they could.

    For example, LZMA easily outperforms gzip using the same 32k dictionary/block size. Lower the dictionary size and run some tests for yourself.
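
    If you want to test this programmatically rather than through command-line switches, here is a rough sketch using liblzma's C API; the preset and the 32 KiB dict_size are just illustrative starting points, not tuned settings:
    Code:
    /* Sketch: force LZMA2 down to a gzip-sized 32 KiB dictionary via liblzma.
       Link with -llzma; error handling is kept minimal. */
    #include <lzma.h>

    int init_small_lzma2(lzma_stream *strm)
    {
        /* static so the options outlive this call, in case the encoder keeps a reference */
        static lzma_options_lzma opt;
        if (lzma_lzma_preset(&opt, 6))            /* start from preset -6 */
            return 1;
        opt.dict_size = 1u << 15;                 /* 32 KiB, same as deflate's window */

        lzma_filter filters[] = {
            { .id = LZMA_FILTER_LZMA2, .options = &opt },
            { .id = LZMA_VLI_UNKNOWN,  .options = NULL },
        };

        lzma_stream init = LZMA_STREAM_INIT;
        *strm = init;
        return lzma_stream_encoder(strm, filters, LZMA_CHECK_CRC32) != LZMA_OK;
    }
    Keep in mind that even with a 32 KiB dictionary, LZMA's match finder allocates several times that, so encoder memory won't drop all the way down to gzip's level.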

    Also, years ago, I did a benchmark following this same concept. Maybe you'll find it helpful:
    https://encode.su/threads/2097-The-1...sion-Benchmark

  3. Thanks:

    SolidComp (8th June 2020)

  4. #3
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    first, gzip is an archiver/compression format, not a single implementation

    the classic library used for gzip compression is zlib, which uses about 256 KB (64 KB for the sliding window, 64-128 KB for hash heads, and 64-128 KB for next pointers)
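
    for reference, zlib's own documentation gives the deflate memory as roughly (1 << (windowBits+2)) + (1 << (memLevel+9)) bytes, which is 256 KB with the defaults. a minimal sketch with the usual gzip knobs spelled out (plain zlib API, the values are just the defaults):
    Code:
    /* deflate memory per the zlib docs: (1 << (windowBits+2)) + (1 << (memLevel+9))
       = 128 KB + 128 KB = 256 KB with the values below, plus a few KB of overhead. */
    #include <string.h>
    #include <zlib.h>

    int init_gzip_encoder(z_stream *strm)
    {
        memset(strm, 0, sizeof(*strm));           /* zalloc/zfree/opaque = Z_NULL */
        return deflateInit2(strm,
                            6,                    /* same level as gzip -6              */
                            Z_DEFLATED,
                            15 + 16,              /* 32 KB window; +16 = gzip wrapper   */
                            8,                    /* memLevel: hash heads + prev chains */
                            Z_DEFAULT_STRATEGY);  /* Z_OK (0) on success               */
    }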

    the two recent "zlib killers" are brotli and zstd, so just play with their modes. at least with zstd it should be possible to use a 32-64 KB dictionary and quick match finders to outperform zlib while still using 256-1024 KB of memory
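
    a quick sketch of that idea with zstd's advanced API (needs a reasonably recent zstd; the windowLog/strategy/level values are just a starting point to experiment with, not a tuned recommendation):
    Code:
    /* Cap zstd's window at 32 KB (windowLog = 15) and pick the fastest
       match finder, to stay in the same memory ballpark as zlib. */
    #include <zstd.h>

    size_t compress_small_window(void *dst, size_t dstCap,
                                 const void *src, size_t srcSize)
    {
        ZSTD_CCtx *cctx = ZSTD_createCCtx();
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 15);        /* 32 KB window */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_strategy, ZSTD_fast);  /* quick match finder */
        size_t r = ZSTD_compress2(cctx, dst, dstCap, src, srcSize);
        ZSTD_freeCCtx(cctx);
        return r;   /* check with ZSTD_isError() */
    }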

  5. Thanks:

    SolidComp (8th June 2020)

  6. #4
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by comp1 View Post
    Well, many compressors do exceed the compression ratio achieved by gzip. What has happened is that, as time has progressed and the average amount of RAM in PCs has increased, compressors have simply allocated more memory--because they could.

    For example, LZMA easily outperforms gzip using the same 32k dictionary/block size. Lower the dictionary size and run some tests for yourself.

    Also, years ago, I did a benchmark following this same concept. Maybe you'll find it helpful:
    https://encode.su/threads/2097-The-1...sion-Benchmark
    Very interesting benchmark.

    Yes, I know that many compressors beat gzip ratios – I'm talking about resource usage. I want a codec that is as light on CPU and memory as gzip, but with markedly better compression ratios (and at least as fast as gzip -6).

  7. #5
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    342
    Thanks
    197
    Thanked 58 Times in 42 Posts
    Quote Originally Posted by SolidComp View Post
    Very interesting benchmark.

    Yes, I know that many compressors beat gzip ratios – I'm talking about resource usage. I want a codec that is as light on CPU and memory as gzip, but with markedly better compression ratios (and at least as fast as gzip -6).
    What I was trying to say was that many algorithms, even when their memory usage is decreased to gzip's level, easily surpass zlib/deflate (gzip) in compression ratio and decompression speed.

    This was accomplished about 25 years ago with Microsoft's CAB format (using the LZX algorithm). If you decrease the window size to 15 (32 KB), it will compress and decompress with a 32 KB window just like zlib/deflate does, while reaching a much greater compression ratio and a faster decompression speed. CPU usage should be relatively similar.

    I've attached CABARC.EXE for you. When compressing your files/folders, use
    Code:
    -m LZX:15
    to use the 32 KB window size that the zlib/deflate format (and the gzip archiver) uses.
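
    For example, a command along these lines should build a new cabinet with the 32 KB LZX window (I'm writing the syntax from memory, so check CABARC's built-in help if it complains):
    Code:
    cabarc -m LZX:15 N test.cab *.txt
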
    Attached Files: CABARC.EXE

  8. Thanks:

    SolidComp (9th June 2020)

Similar Threads

  1. HEVC successor: Versatile Video Coding
    By SolidComp in forum Data Compression
    Replies: 1
    Last Post: 19th April 2019, 21:16
  2. gzip on a chip
    By SolidComp in forum Data Compression
    Replies: 33
    Last Post: 3rd March 2019, 04:52
  3. gzip - Intel IPP
    By M4ST3R in forum Download Area
    Replies: 5
    Last Post: 2nd June 2010, 15:09
  4. Gzip 1.2.4 hack (OpenWatcom compiles)
    By Rugxulo in forum Data Compression
    Replies: 9
    Last Post: 22nd May 2009, 00:17
  5. gzip-1.2.4-hack - a hacked version of gzip
    By encode in forum Forum Archive
    Replies: 63
    Last Post: 10th September 2007, 04:16
