Page 2 of 3 FirstFirst 123 LastLast
Results 31 to 60 of 65

Thread: Google: Compress Data More Densely with Zopfli

  1. #31
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    Quote Originally Posted by comp1 View Post
    Can someone else do some benchmarks?
    Not someone else, but here is a new benchmark:

    Input SSD:
    7,313,400,458 bytes, 28 IIS logfiles

    Output SSD:
    534,300,429 bytes gzip -9
    534,283,890 bytes gzip -9 (one file)
    532,877,929 bytes arc zip maximum -mx9 (one file)
    527,866,709 bytes rar zip best (one file)
    520,810,577 bytes 7z zip ultra (one file)
    511,298,927 bytes zopfli default
    511,080,685 bytes kzip xtreme default (one file)
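
    To put the sizes above in relative terms, here is a small sketch of the arithmetic. The helper name percent_saved is made up for this example and is not part of any of the tools listed.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage saved by `candidate` relative to `baseline`, both in bytes.
   Hypothetical helper, only meant to illustrate the arithmetic. */
double percent_saved(uint64_t baseline, uint64_t candidate) {
  assert(baseline > 0);
  return 100.0 * ((double)baseline - (double)candidate) / (double)baseline;
}
```

    With the figures from this post, percent_saved(534300429, 511298927) comes out at roughly 4.3, i.e. zopfli's output is about 4.3% smaller than that of gzip -9 on this data set.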

  2. #32
    Member caveman's Avatar
    Join Date
    Jul 2009
    Location
    Strasbourg, France
    Posts
    190
    Thanks
    8
    Thanked 64 Times in 33 Posts
    I've found a parameter that can be tweaked to slightly modify Zopfli's behaviour, in lz77.c:
    Code:
    /*
    Gets the value of the length given the distance. Typically, the value of the
    length is the length, but if the distance is very long, decrease the value of
    the length a bit to make up for the fact that long distances use large amounts
    of extra bits.
    */
    static int GetLengthValue(int length, int distance) {
      /*
      At distance > 1024, using length 3 is no longer good, due to the large amount
      of extra bits for the distance code. distance > 1024 uses 9+ extra bits, and
      this seems to be the sweet spot.
      */
      return distance > 1024 ? length - 1 : length;
    }
    Replacing 1024 in the line "return distance > 1024 ? length - 1 : length;" by 512, 768, 1536, 2048, 3072 or 4096 will produce a different file (even block splitting is affected).
    For instance on the file deflate.c it gives this (size in bytes):
    Code:
    5459 deflate.c-zop0768.gz
    5459 deflate.c-zop1024.gz
    5457 deflate.c-zop0512.gz
    5457 deflate.c-zop1536.gz
    
    5456 deflate.c-zop0768-i100.gz
    5456 deflate.c-zop1024-i100.gz
    5454 deflate.c-zop0512-i100.gz
    5453 deflate.c-zop1536-i100.gz
    
    5455 deflate.c-zop0768-i1000.gz
    5455 deflate.c-zop1024-i1000.gz
    5453 deflate.c-zop1536-i1000.gz
    5452 deflate.c-zop0512-i1000.gz
    Using 512 or 1536 produces a slightly smaller file.
    Applying deflopt -b still saves a few bytes:
    Code:
    5455 deflate.c-zop0768d.gz
    5455 deflate.c-zop1024d.gz
    5453 deflate.c-zop0512d.gz
    5453 deflate.c-zop1536d.gz
    
    5455 deflate.c-zop0768-i100d.gz
    5454 deflate.c-zop1024-i100d.gz
    5453 deflate.c-zop0512-i100d.gz
    5452 deflate.c-zop1536-i100d.gz
    
    5454 deflate.c-zop0768-i1000d.gz
    5453 deflate.c-zop1024-i1000d.gz
    5452 deflate.c-zop1536-i1000d.gz
    5451 deflate.c-zop0512-i1000d.gz
    On a larger file (book1):
    Code:
    299233 book1-zop1024.gz
    299171 book1-zop0768.gz
    299161 book1-zop0512.gz
    299043 book1-zop2048.gz
    299019 book1-zop3072.gz
    299015 book1-zop1536.gz
    298761 book1-zop4096.gz
    
    299165 book1-zop0768-i100.gz
    299159 book1-zop0512-i100.gz
    299148 book1-zop1024-i100.gz
    298766 book1-zop2048-i100.gz
    298760 book1-zop3072-i100.gz
    298752 book1-zop4096-i100.gz
    298740 book1-zop1536-i100.gz
    
    again after deflopt -b:
    299225 book1-zop1024d.gz
    299160 book1-zop0768d.gz
    299151 book1-zop0512d.gz
    299036 book1-zop2048d.gz
    299013 book1-zop3072d.gz
    299008 book1-zop1536d.gz
    298751 book1-zop4096d.gz
    
    299154 book1-zop0768-i100d.gz
    299149 book1-zop0512-i100d.gz
    299140 book1-zop1024-i100d.gz
    298759 book1-zop2048-i100d.gz
    298757 book1-zop3072-i100d.gz
    298746 book1-zop4096-i100d.gz
    298733 book1-zop1536-i100d.gz
    I don't know against what they trained Zopfli to pick 1024, but apparently it was not these two files.
    This is pretty similar to the TOO_FAR story in Zlib: http://optipng.sourceforge.net/pngtech/too_far.html
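
    For anyone who wants to repeat the experiment without editing the constant by hand, here is a rough sketch. dist_extra_bits follows the distance code table in RFC 1951 (section 3.2.5); length_value_with_threshold is a hypothetical variant of GetLengthValue with the cut-off passed in as a parameter, not something that exists in the Zopfli sources.

```c
#include <assert.h>

/* Number of extra bits DEFLATE spends on a match distance (RFC 1951, 3.2.5).
   Distances 1..4 need none; after that the count is floor(log2(d - 1)) - 1. */
int dist_extra_bits(int distance) {
  int d, bits = 0;
  assert(distance >= 1 && distance <= 32768);
  if (distance < 5) return 0;
  for (d = distance - 1; d > 1; d >>= 1) bits++; /* bits = floor(log2(distance - 1)) */
  return bits - 1;
}

/* Hypothetical variant of GetLengthValue: same rule as the quoted code,
   but with the distance cut-off as a parameter instead of a fixed 1024. */
int length_value_with_threshold(int length, int distance, int threshold) {
  return distance > threshold ? length - 1 : length;
}
```

    dist_extra_bits(1024) is 8 and dist_extra_bits(1025) is 9, which matches the "9+ extra bits" remark in the source comment. The other cut-offs tried above do not all sit on such a boundary: 512 and 2048 do (7 to 8 bits and 9 to 10 bits), while 1536 falls in the middle of the 9-bit range.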

    Oh, I forgot:
    Code:
    huffmix -v book1-zop4096-i100d.gz book1-zop4096d.gz book1-zop4096-mix.gz
    book1-zop4096-i100.gz (298746 bytes)
    
    Block boundaries: 0,9a3,415f (3 blocks)
    
    book1-zop4096.gz (298751 bytes)
    
    Block boundaries: 0,9a3,415f (3 blocks)
    
     File Type C-Offset C-Length U-Offset U-Length
       A    2         0    10608        0     2467
       A    2     10608    49202      9a3    14268
       A    2     59810  2330011     415f   752036
    
     File Type C-Offset C-Length U-Offset U-Length
       B    2         0    10604        0     2467
       B    2     10604    49253      9a3    14268
       B    2     59857  2330005     415f   752036
    
     File C-Offset C-Length
       B         0    10604
       A     10608    49202
       B     59857  2330005
    
    Saved 10 bits, output file size 298745 bytes
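
    The selection step in that log can be sketched as follows. This is only my reading of the output, not huffmix's actual code: it assumes both streams split the data at the same uncompressed offsets (as the identical block boundaries suggest) and that the C-Length columns are in bits. The function name mix_blocks is made up.

```c
#include <assert.h>
#include <stddef.h>

/* For each block, keep whichever of the two encodings is smaller.
   a_bits/b_bits hold the compressed length of each block in bits,
   pick[i] receives 'A' or 'B', and the total size in bits is returned.
   Ties go to A. Hypothetical sketch, not taken from huffmix. */
unsigned long mix_blocks(const unsigned long *a_bits,
                         const unsigned long *b_bits,
                         size_t n, char *pick) {
  unsigned long total = 0;
  size_t i;
  assert(a_bits != NULL && b_bits != NULL && pick != NULL);
  for (i = 0; i < n; i++) {
    if (b_bits[i] < a_bits[i]) {
      pick[i] = 'B';
      total += b_bits[i];
    } else {
      pick[i] = 'A';
      total += a_bits[i];
    }
  }
  return total;
}
```

    Fed with the three block lengths of A and B from the log, this picks B, A, B and gives 2389811 bits, which is 10 bits less than A's 2389821 and agrees with the "Saved 10 bits" line.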
    And it strikes me right now: why do they systematically subtract one, rather than only when the length is small (3 or 4)?! It could be a bug; I'll check it tomorrow.
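
    To make the question concrete, here are the two behaviours side by side. length_value_original restates the quoted GetLengthValue; length_value_short_only is a hypothetical alternative that only discounts matches of length 3 or 4. It is untested against real data and not something found in Zopfli.

```c
#include <assert.h>

/* Behaviour of the quoted lz77.c code: every match beyond distance 1024
   has its length value reduced by one, whatever its length. */
int length_value_original(int length, int distance) {
  return distance > 1024 ? length - 1 : length;
}

/* Hypothetical alternative: only short matches (length 3 or 4) are
   discounted at long distances; longer matches keep their full value. */
int length_value_short_only(int length, int distance) {
  assert(length >= 3); /* DEFLATE matches are at least 3 bytes long */
  return (distance > 1024 && length <= 4) ? length - 1 : length;
}
```

    The two only differ for matches of length 5 or more at distances above 1024, for example length 100 at distance 2000 is valued 99 by the first and 100 by the second.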
    Last edited by caveman; 6th March 2013 at 04:45. Reason: spell checking

  3. #33
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Good afternoon!
    Has anyone come across a zip compression program based on Zopfli?

  4. #34
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I looked in the Zopfli sources and there was some Zip code. Maybe it's incomplete? Maybe it just has to be plugged in? Dunno.

  5. #35
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    I didn't quite understand you. What I'm interested in is how to create a zip archive using Zopfli.

  6. #36
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I think I understood you.
    However now I see that I was wrong, there's no zip but zlib mode in zopfli.

  7. #37
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    Quote Originally Posted by m^2 View Post
    I think I understood you.
    However now I see that I was wrong, there's no zip but zlib mode in zopfli.
    Never mind, we can look at how advancecomp/minizip works.
    Last edited by roytam1; 11th March 2013 at 04:18.

  8. #38
    Member Jaff's Avatar
    Join Date
    Oct 2012
    Location
    Dracula's country
    Posts
    104
    Thanks
    116
    Thanked 22 Times in 18 Posts
    Can anyone make a .ZIP archive recompressor/optimiser using Zopfli?
    Last edited by Jaff; 14th March 2013 at 18:40.

  9. #39
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts

  10. #40
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Doesn't FreeArc use pigz?

  11. #41
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Test results with pigz -11. It is faster than kzip (but still slow) with slightly better compression. http://mattmahoney.net/dc/text.html#3098

  12. #42
    Member Jaff's Avatar
    Join Date
    Oct 2012
    Location
    Dracula's country
    Posts
    104
    Thanks
    116
    Thanked 22 Times in 18 Posts
    Can you please post compiled .exe of pigz? Thank you!

  13. #43
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    I was only able to compile it in Linux. There's a lot of UNIX specific stuff in the source.

  14. #44
    Member
    Join Date
    Oct 2009
    Location
    usa
    Posts
    62
    Thanks
    1
    Thanked 9 Times in 6 Posts
    Well, I know for sure there is a win32 compile of pigz; I downloaded it a while ago. Hopefully that same person can do a new win32 compile with the new Zopfli code.

    Does this mean that we can use our 8-core processors to speed up Zopfli implementation with pigz by means of parallel processing? Further, since pigz will support .ZIP archive format, it would seem we could then create .ZIP archives with multiple files as well...

  15. #45
    Member
    Join Date
    Mar 2013
    Location
    Windowless Basement
    Posts
    9
    Thanks
    0
    Thanked 1 Time in 1 Post
    I've just made a win32 build of pigz 2.3, compiled with mingw gcc 4.7.2, pthreads 2.9.0 and zlib 1.2.7

    I've included a patch file of the changes made, but I'll summarise the important one here:
    - MinGW doesn't seem to support symlinks, at least with sys/stat.h. Windows symlinks should still work, but this build shouldn't handle them differently to ordinary files (i.e. it won't skip symlinks)

    Cygwin may or may not do things better - I don't know as I don't have it installed.

    Hope that helps!
    Attached Files
    Last edited by DotDotDot; 21st March 2013 at 15:27.

  16. Thanks:

    Jaff (3rd June 2016)

  17. #46
    Member
    Join Date
    May 2009
    Location
    France
    Posts
    99
    Thanks
    13
    Thanked 75 Times in 45 Posts
    You could have changed utimes to utime...

    $ diff pigz.c pigz.c.orig
    291,292d290
    < #define S_IFLNK 0xa000
    <
    313d310
    < #include <sys/utime.h>
    3044c3041
    < struct utimbuf times;
    ---
    > struct timeval times[2];
    3054c3051
    < //(void)chown(to, st.st_uid, st.st_gid);
    ---
    > (void)chown(to, st.st_uid, st.st_gid);
    3057,3059c3054,3058
    < times.actime = st.st_atime;
    < times.modtime = st.st_mtime;
    < (void)utime(to, &times);
    ---
    > times[0].tv_sec = st.st_atime;
    > times[0].tv_usec = 0;
    > times[1].tv_sec = st.st_mtime;
    > times[1].tv_usec = 0;
    > (void)utimes(to, times);
    3065c3064
    < struct utimbuf times;
    ---
    > struct timeval times[2];
    3067,3069c3066,3070
    < times.actime = t;
    < times.modtime = t;
    < (void)utime(path, &times);
    ---
    > times[0].tv_sec = t;
    > times[0].tv_usec = 0;
    > times[1].tv_sec = t;
    > times[1].tv_usec = 0;
    > (void)utimes(path, times);
    3103c3104
    < if (stat(g.inf, &st)) {
    ---
    > if (lstat(g.inf, &st)) {
    3111c3112
    < } while (stat(g.inf, &st) && errno == ENOENT);
    ---
    > } while (lstat(g.inf, &st) && errno == ENOENT);

  18. #47
    Member
    Join Date
    Mar 2013
    Location
    Windowless Basement
    Posts
    9
    Thanks
    0
    Thanked 1 Time in 1 Post
    Thanks for the suggestion AiZ! I clearly missed that.
    Updated above post with newer version which sets the timestamps on the files.

  19. #48
    Member
    Join Date
    May 2009
    Location
    France
    Posts
    99
    Thanks
    13
    Thanked 75 Times in 45 Posts
    Last one.

    Download the pthreads-win32 tarball into the directory where you previously downloaded and unpacked the pigz tarball, unpack it, and compile it with:
    $ make clean GC-static
    This creates the libpthreadGC2.a archive.

    $ diff Makefile Makefile.orig
    1,2c1,2
    < CC=gcc -DPTW32_STATIC_LIB
    < CFLAGS=-O3 -mtune=generic -Wall -Wextra -posix
    ---
    > CC=cc
    > CFLAGS=-O3 -Wall -Wextra
    6c6
    < $(CC) -static -o pigz $^ -lz ../pthreads-w32-2-9-1-release/libpthreadGC2.a
    ---
    > $(CC) -o pigz $^ -lpthread -lz
    77c77
    < @rm -f *.o zopfli/*.o pigz.exe unpigz.exe pigzn pigzt pigz.c.gz pigz.c.zz pigz.c.zip
    ---
    > @rm -f *.o zopfli/*.o pigz unpigz pigzn pigzt pigz.c.gz pigz.c.zz pigz.c.zip


    Now pigz.exe is really "static".

  20. #49
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    @AiZ: I think you should use "diff -U5 origfile modfile" for better diff output.

  21. #50
    Member
    Join Date
    Mar 2013
    Location
    Windowless Basement
    Posts
    9
    Thanks
    0
    Thanked 1 Time in 1 Post
    Thanks again for the tip AiZ!
    I didn't grab the source code for pthread, I just used mingw-get, which doesn't seem to have a static version. The DLL doesn't really bother me personally, but if anyone really does want a fully static build, your instructions will be invaluable.

    I'll keep that in mind for any future builds - thanks again!

  22. #51
    Member
    Join Date
    May 2009
    Location
    France
    Posts
    99
    Thanks
    13
    Thanked 75 Times in 45 Posts
    Pngwolf linked against zopfli (instead of 7-zip).

    pngwolf_zopfli_win32.7z

    Barely tested, be warned.

  23. Thanks (2):

    Jaff (13th October 2015),lorents17 (30th April 2015)

  24. #52
    Member nikkho's Avatar
    Join Date
    Jul 2011
    Location
    Spain
    Posts
    554
    Thanks
    223
    Thanked 166 Times in 107 Posts
    Quote Originally Posted by AiZ View Post
    Pngwolf linked against zopfli (instead of 7-zip).
    Thank you very much.

  25. #53
    Member Jaff's Avatar
    Join Date
    Oct 2012
    Location
    Dracula's country
    Posts
    104
    Thanks
    116
    Thanked 22 Times in 18 Posts
    Can anybody do the "magic" again and compile latest pigz version? A win32 build would be great. Thank you!

  26. #54
    Member przemoc's Avatar
    Join Date
    Aug 2011
    Location
    Poland
    Posts
    44
    Thanks
    3
    Thanked 23 Times in 13 Posts

    pigz 2.3.3 for Windows

    I've built pigz 2.3.3 on my W7 x64 in up-to-date MSYS2's MinGW-w64 Win32/64 Shell using recent:
    Code:
    gcc version 5.3.0 (Rev1, Built by MSYS2 project)
    after applying changes in pigz.c based on AiZ's suggestions:
    https://gist.github.com/przemoc/bd6c342ff7f31a0c4ec2

    and modifying Makefile in usual fashion:
    • setting CC to "gcc",
    • expanding LDFLAGS with "-static -static-libgcc",
    • expanding CFLAGS with "-D__USE_MINGW_ANSI_STDIO",
    • moving "-lz" to the end of compile+link $(CC) line.


    Final exe has been stripped.

    Seems to work, but USE AT YOUR OWN RISK! NO WARRANTY!
    Attached Files

  27. Thanks (4):

    comp1 (4th February 2016),Jaff (5th February 2016),lorents17 (23rd February 2016),spark (29th February 2016)

  28. #55
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Zopfli has the function OptimizeHuffmanForRle which is used to improve the compression of huffman trees. Brotli has an improved, but compatible version.
    How can this function be copied into zopfli?

    https://github.com/google/zopfli/issues/93

  29. #56
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    976
    Thanks
    266
    Thanked 350 Times in 221 Posts
    I came up with that hack and wrote both of these functions originally. Last week, I played with this for about 3 hours (trying to fit the function from brotli into zopfli), and couldn't get savings in my benchmarks. Sometimes the output is smaller, sometimes larger. The coding of Huffman codes is different in brotli and deflate, so the manipulation of the RLE coding reflects this, and this is the main reason why brotli's code is different from zopfli's.
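
    For readers wondering what "RLE coding" refers to on the deflate side: the code lengths of a dynamic Huffman block are themselves run-length coded, with symbol 16 repeating the previous length 3 to 6 times, symbol 17 repeating zero 3 to 10 times and symbol 18 repeating zero 11 to 138 times (RFC 1951, section 3.2.7). The sketch below is not OptimizeHuffmanForRle; it is only a simple greedy counter, with a made-up name, showing how many code-length symbols a given sequence of lengths would need.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the code-length symbols a simple greedy encoder would emit for
   lengths[0..n) under the DEFLATE scheme (RFC 1951, 3.2.7). Greedy, so not
   necessarily the minimum; extra bits and the Huffman coding of the symbols
   themselves are ignored. Hypothetical illustration only. */
size_t count_rle_symbols(const unsigned char *lengths, size_t n) {
  size_t i = 0, symbols = 0;
  while (i < n) {
    size_t run = 1, left;
    assert(lengths[i] <= 15);
    while (i + run < n && lengths[i + run] == lengths[i]) run++;
    if (lengths[i] == 0) {
      left = run;
      while (left >= 11) { /* symbol 18: 11..138 zeros */
        left -= left > 138 ? 138 : left;
        symbols++;
      }
      if (left >= 3) { /* symbol 17: 3..10 zeros */
        left = 0;
        symbols++;
      }
      symbols += left; /* 0..2 zeros written literally */
    } else {
      symbols++; /* the length itself */
      left = run - 1;
      while (left >= 3) { /* symbol 16: repeat previous 3..6 times */
        left -= left > 6 ? 6 : left;
        symbols++;
      }
      symbols += left; /* 0..2 leftover copies written literally */
    }
    i += run;
  }
  return symbols;
}
```

    With the lengths 8,8,8,9,8,8,8,8 this counts 6 symbols, while eight 8s in a row need only 3. A real optimiser cannot simply overwrite a length like this, since the result must remain a valid Huffman code; as far as I understand it, OptimizeHuffmanForRle instead adjusts the symbol counts before the tree is built so that such runs become more likely.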

  30. #57
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Could you share some example code? I would like to test it too.

  31. #58
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Could you tell me how to apply this patch?
    https://github.com/google/zopfli/iss...ment-189691332

  32. #59
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    976
    Thanks
    266
    Thanked 350 Times in 221 Posts
    Quote Originally Posted by lorents17 View Post
    How often do you see savings in your benchmarks? What kind of files get smaller?

  33. #60
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Unfortunately, I cannot run the test because I cannot apply the patch. If somebody applies the patch and sends me the files, I will be happy to run the tests.
