Thread: LZO-Professional-and-SNAPPY--VERY-FAST-COMPRESSION-are-released

  1. #1
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts

    http://www.oberhumer.com/products/lzo-professional/

    http://www.oberhumer.com/products/lz...onal/showcase/

    03 Mar 2011 - LZO Professional data compression library - Version 3.00

    LZO Professional

    - is an enhanced version of the OpenSource LZO GPL library.

    - is fully source and binary compatible with LZO GPL.
    Of course, the compressed data is fully compatible as well.

    - supports all major workstation, desktop and embedded architectures, including
    Alpha, AMD64, ARM, HPPA, I386, Itanium (IA64), M68K, MIPS, MIPS64,
    POWERPC, POWERPC64, S390, S390X, SH3, SH4, SPARC, SPARC64,
    X64, X86 and XSCALE.

    - improves in all aspects over the LZO GPL edition:
    .. improved compression ratio
    .. decompressors .. significantly faster, and this without using any assembly language
    .. the fast compression algorithms are much faster as well

    ---
    Meanwhile, Google has released Snappy 1.0.0:

    Snappy has previously been referred to as "Zippy"

    On a single core of a Core i7 processor in 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more.

    Snappy is a compression/decompression library.
    It does not aim for maximum compression ...
    it aims for very high speeds and reasonable compression.
    ---
    http://code.google.com/p/snappy/
    ---
    A benchmark against the new LZO Professional would be interesting.
    ---
    best regards
    Last edited by joerg; 24th March 2011 at 01:36. Reason: correct

  2. #2
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    473
    Thanked 175 Times in 85 Posts
    Is there a compiled version of Snappy around?

  3. #3
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
    http://nishi.dreamhosters.com/v/snappy_100_bin.rar

    Not the best compiler though (due to an old Cygwin setup), but it's not portable enough to build any other way.

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    Quick test: 2.0 GHz T3200, 3 GB, 32 bit Vista.

    Code:
    C:\tmp\snappy_100_bin>snappy_unittest.exe
    Running microbenchmarks.
    WARNING: Compiled with assertions enabled, will be slow.
    Benchmark            Time(ns)    CPU(ns) Iterations
    ---------------------------------------------------
    BM_UFlat/0             186000     188000        500 519.4MB/s  html
    BM_UFlat/1            1842592    1879629        108 356.2MB/s  urls
    BM_UFlat/2              39400      40600       5000 2.9GB/s  jpg
    BM_UFlat/3              82800      81200       2500 1.1GB/s  pdf
    BM_UFlat/4             741444     771863        263 506.1MB/s  html4
    BM_UFlat/5              72907      77407       2222 303.1MB/s  cp
    BM_UFlat/6              33753      35103       6666 302.9MB/s  c
    BM_UFlat/7              11850      11700      20000 303.3MB/s  lsp
    BM_UFlat/8            3278688    3065573         61 320.3MB/s  xls
    BM_UFlat/9             621451     640378        317 226.5MB/s  txt1
    BM_UFlat/10            537190     559228        363 213.5MB/s  txt2
    BM_UFlat/11           1636363    1677685        121 242.6MB/s  txt3
    BM_UFlat/12           2227272    2125000         88 216.3MB/s  txt4
    BM_UFlat/13            853448     875000        232 559.4MB/s  bin
    BM_UFlat/14            121586     121586       1538 299.9MB/s  sum
    BM_UFlat/15             14300      14100      10000 285.9MB/s  man
    BM_UFlat/16            198000     202000       1000 472.1MB/s  pb
    BM_UFlat/17            621451     593059        317 296.4MB/s  gaviota
    BM_UValidate/0          73200      74800       2500 1.3GB/s  html
    BM_UValidate/1         860262     816593        229 819.9MB/s  urls
          3 [main] snappy_unittest 3436 _cygtls::handle_exceptions: Exception: STATUS_INTEGER_DIVIDE_BY_ZERO
       3766 [main] snappy_unittest 3436 open_stackdumpfile: Dumping stack trace to snappy_unittest.exe.stackdump

  5. #5
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    in-memory test with ENWIK8 using 1 core of Athlon X4 2.8 GHz (compiled under MinGW -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math -funroll-loops --param inline-unit-growth=999):

    Code:
    snappy           = 531 ms (188323 KB/s), 100000000 -> 58350605
    lzjb_compress    = 887 ms (112739 KB/s), 100000000 -> 68711273
    fastlz1_compress = 591 ms (169204 KB/s), 100000000 -> 55239233
    fastlz2_compress = 644 ms (155279 KB/s), 100000000 -> 54163013
    lzf_compress     = 679 ms (147275 KB/s), 100000000 -> 57695415
    lzrw1_compress   = 693 ms (144300 KB/s), 100000000 -> 59669043
    lzrw1a_compress  = 664 ms (150602 KB/s), 100000000 -> 59448006
    lzrw2_compress   = 648 ms (154320 KB/s), 100000000 -> 55312164
    lzrw3_compress   = 649 ms (154083 KB/s), 100000000 -> 52468327
    quicklz          = 591 ms (169204 KB/s), 100000000 -> 52334371
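
    As a sanity check on how to read these lines, here is a minimal Python sketch. It is not part of the benchmark tool; `parse_result` and the regular expression are made up for illustration. It assumes "KB/s" means 1000 bytes per second with the result truncated, which is what reproduces the figures above, and it also derives the compression ratio.

```python
import re

# Hypothetical parser for result lines such as
#   "snappy = 531 ms (188323 KB/s), 100000000 -> 58350605"
RESULT_LINE = re.compile(
    r"^\s*(\S+)\s*=\s*(\d+) ms \((\d+) KB/s\), (\d+) -> (\d+)"
)

def parse_result(line):
    """Split one result line into its fields and recompute speed and ratio."""
    match = RESULT_LINE.match(line)
    if match is None:
        raise ValueError("unrecognised result line: %r" % line)
    name = match.group(1)
    elapsed_ms, reported_kb_s, src_bytes, packed_bytes = map(int, match.groups()[1:])
    return {
        "name": name,
        "reported_kb_s": reported_kb_s,
        # bytes per millisecond equals KB/s when 1 KB = 1000 bytes (assumption)
        "kb_s": src_bytes // elapsed_ms,
        # compressed size as a fraction of the original size
        "ratio": packed_bytes / src_bytes,
    }

row = parse_result("snappy           = 531 ms (188323 KB/s), 100000000 -> 58350605")
print(row["name"], row["kb_s"], "KB/s, ratio %.4f" % row["ratio"])
```

    Under that reading, lower time and lower ratio are both better, so no single column ranks the codecs on its own.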

  6. #6
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
    Thanks. Sure, I'm too lazy, but the included unit test really didn't compile otherwise.

    Anyway, I've got this (see http://codepad.org/XP4FSppm ):
    Code:
    snappy           :   57.190 1219.534
    lzjb_compress    :   70.223 1438.951
    fastlz1_compress :   55.112 1155.477
    fastlz2_compress :   54.747 1133.675
    lzf_compress     :   58.241 1207.534
    lzrw1_compress   :   60.132 1248.739
    lzrw1a_compress  :   59.618 1243.823
    lzrw2_compress   :   55.803 1157.621
    lzrw3_compress   :   53.311 1098.480
    quicklz          :   52.555 1095.056
    The numbers are values of my "distribution metric": the file is compressed, then uploaded,
    then downloaded, then decompressed. The metric is the total time of that operation for
    specific transfer speeds and a specific number of downloads (it's a simulation of file
    distribution via upload to a server).

    Here decompression speed is assumed to be equal to compression speed, and the transfer
    speeds are 100 Mbit/s for the 1st column and 10 Mbit/s / 4 Mbit/s for the 2nd.
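
    For readers who want to play with the idea, a rough Python sketch of such a metric follows. It is only an assumption about how the pieces could be combined, not the code behind the codepad link: the function name, the parameters and the way the number of downloads enters the sum are all hypothetical, so it is not expected to reproduce the numbers above.

```python
def distribution_time(src_bytes, packed_bytes, codec_bytes_per_s,
                      up_bits_per_s, down_bits_per_s, downloads=1):
    """Total seconds to compress, upload once, then download and unpack N times.

    Hypothetical model: decompression is assumed to run at the same speed
    as compression, as stated in the post.
    """
    compress = src_bytes / codec_bytes_per_s
    upload = packed_bytes * 8 / up_bits_per_s
    download = packed_bytes * 8 / down_bits_per_s
    decompress = src_bytes / codec_bytes_per_s
    return compress + upload + downloads * (download + decompress)

# Made-up example: 100 MB input, 50 MB output, 100 MB/s codec, 100 Mbit/s links.
example = distribution_time(100_000_000, 50_000_000, 100_000_000,
                            100_000_000, 100_000_000, downloads=1)
print(example)
```

    The point of a metric like this is that a slower codec can still win overall if the bytes it saves cost more to transfer than the extra CPU time it spends.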

  7. #7
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @Shelwien: thank you very much for the quick compile.

    Does snappy produce different errors depending on which CPU it runs on?

    Or is it a problem with the operating system?

    best regards

    ---
    running on Intel Mobile Core2Duo P7350@2.00GHz , 4 GB RAM , Vista 32 SP2

    snappy_unittest
    Running microbenchmarks.
    WARNING: Compiled with assertions enabled, will be slow.
    Benchmark Time(ns) CPU(ns) Iterations
    ---------------------------------------------------
    BM_UFlat/0 183460 177756 1052 549.4MB/s html
    BM_UFlat/1 1781818 1554545 110 430.7MB/s urls
    BM_UFlat/2 40000 40400 5000 2.9GB/s jpg
    BM_UFlat/3 82000 87200 2500 1.0GB/s pdf
    BM_UFlat/4 748120 763157 266 511.9MB/s html4
    BM_UFlat/5 71053 71053 2857 330.2MB/s cp
    BM_UFlat/6 33453 32703 6666 325.2MB/s c
    BM_UFlat/7 11850 11700 20000 303.3MB/s lsp
    BM_UFlat/8 3140625 2921875 64 336.1MB/s xls
    BM_UFlat/9 611620 620795 327 233.6MB/s txt1
    BM_UFlat/10 529729 548648 370 217.6MB/s txt2
    BM_UFlat/11 1588709 1508064 124 269.9MB/s txt3
    BM_UFlat/12 2163043 2206521 92 208.3MB/s txt4
    BM_UFlat/13 839285 834821 224 586.3MB/s bin
    BM_UFlat/14 121848 121848 1666 299.3MB/s sum
    BM_UFlat/15 14100 14000 10000 287.9MB/s man
    BM_UFlat/16 197478 196428 952 485.5MB/s pb
    BM_UFlat/17 608562 620795 327 283.2MB/s gaviota
    BM_UValidate/0 73503 70703 2857 1.3GB/s html
    BM_UValidate/1 864628 886462 229 755.3MB/s urls
    2 [main] snappy_unittest 1708 _cygtls::handle_exceptions: Exception: STATUS_INTEGER_DIVIDE_BY_ZERO
    1326 [main] snappy_unittest 1708 open_stackdumpfile: Dumping stack trace to snappy_unittest.exe.stackdump

    ---
    running on 2x Intel Xeon Nocona @2.80GHz , 4 GB RAM , Win2003 SP2

    snappy_unittest
    Running microbenchmarks.
    WARNING: Compiled with assertions enabled, will be slow.
    Benchmark Time(ns) CPU(ns) Iterations
    ---------------------------------------------------
    BM_UFlat/0 249600 250400 1250 390.0MB/s html
    BM_UFlat/1 2537500 2537500 80 263.9MB/s urls
    3 [main] snappy_unittest 7360 _cygtls::handle_exceptions: Exception: STATUS_INTEGER_DIVIDE_BY_ZERO
    15736 [main] snappy_unittest 7360 open_stackdumpfile: Dumping stack trace to snappy_unittest.exe.stackdump
    Last edited by joerg; 24th March 2011 at 14:44.

  8. #8
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,423
    Thanks
    223
    Thanked 1,052 Times in 565 Posts
    @joerg: it's probably not intended for 32-bit builds either; that looks like a 32-bit overflow in the speed (etc.) calculation.
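
    To illustrate the kind of failure meant here, a small Python sketch that emulates unsigned 32-bit wraparound. This is not Snappy's benchmark code and the thread does not confirm the actual cause; the function names and the numbers are invented for the example.

```python
def wrap_u32(value):
    """Emulate what an unsigned 32-bit variable would hold (wraps modulo 2**32)."""
    return value & 0xFFFFFFFF

def ns_per_byte_u32(elapsed_ns, bytes_per_iter, iterations):
    """Hypothetical 32-bit calculation: the byte total may wrap around to 0."""
    total_bytes = wrap_u32(bytes_per_iter * iterations)
    return elapsed_ns // total_bytes  # raises ZeroDivisionError if it wrapped to 0

def ns_per_byte_wide(elapsed_ns, bytes_per_iter, iterations):
    """Same calculation with a wide integer and a guard against a zero divisor."""
    total_bytes = bytes_per_iter * iterations
    if total_bytes == 0:
        return 0
    return elapsed_ns // total_bytes

# 65536 * 65536 is exactly 2**32, which does not fit in 32 bits.
print(wrap_u32(65536 * 65536))
print(ns_per_byte_wide(2**33, 65536, 65536))
```

    With wide integers the divisor cannot silently become zero, which is one reason a 64-bit build might not show the crash.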

  9. #9
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @Shelwien: Could you please try to compile a Win x64 binary?

    Maybe in that case this kind of error will disappear ...

    best regards

    Joerg

  10. #10
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi,

    There are some fixes for running the Snappy unit test natively on Windows (i.e., using mingw32) in the pipeline. The library itself should work just fine; these are mainly issues with timing and the like.

    /* Steinar */

  11. #11
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by inikep View Post
    in-memory test with ENWIK8 using 1 core of Athlon X4 2.8 GHz (compiled under MinGW -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math -funroll-loops --param inline-unit-growth=999):
    FWIW, these flags are probably overkill for Snappy. Generally I haven't seen much speed-up with -O3 over -O2, and -funroll-loops (implied by -O3) might actually be harmful. A good starting point is probably -O2 -fomit-frame-pointer -fstrict-aliasing, but I believe both -fstrict-aliasing and -fomit-frame-pointer are default on x86/x86-64 for the newest GCC versions.

    Also turning off assertions (-DNDEBUG) is essential if you're not already doing that; Snappy has a lot of assertions to assert correctness, which is very useful during development but not a good idea for maximum speed.

    /* Steinar */

  12. #12
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    gcc version 3.4.5 (mingw-vista special r3)
    -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999
    Code:
    snappy           = 526 ms (190114 KB/s), 100000000 -> 58350605
    lzjb_compress    = 896 ms (111607 KB/s), 100000000 -> 68711273
    fastlz1_compress = 625 ms (160000 KB/s), 100000000 -> 55239233
    fastlz2_compress = 654 ms (152905 KB/s), 100000000 -> 54163013
    lzf_compress     = 664 ms (150602 KB/s), 100000000 -> 57695415
    lzrw1_compress   = 689 ms (145137 KB/s), 100000000 -> 59669043
    lzrw1a_compress  = 653 ms (153139 KB/s), 100000000 -> 59448006
    lzrw2_compress   = 622 ms (160771 KB/s), 100000000 -> 55312164
    lzrw3_compress   = 619 ms (161550 KB/s), 100000000 -> 52468327
    quicklz-1        = 519 ms (192678 KB/s), 100000000 -> 52334371
    -O2 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999
    Code:
    snappy           = 542 ms (184501 KB/s), 100000000 -> 58350605
    lzjb_compress    = 890 ms (112359 KB/s), 100000000 -> 68711273
    fastlz1_compress = 637 ms (156985 KB/s), 100000000 -> 55239233
    fastlz2_compress = 745 ms (134228 KB/s), 100000000 -> 54163013
    lzf_compress     = 718 ms (139275 KB/s), 100000000 -> 57695415
    lzrw1_compress   = 727 ms (137551 KB/s), 100000000 -> 59669043
    lzrw1a_compress  = 684 ms (146198 KB/s), 100000000 -> 59448006
    lzrw2_compress   = 709 ms (141043 KB/s), 100000000 -> 55312164
    lzrw3_compress   = 655 ms (152671 KB/s), 100000000 -> 52468327
    quicklz-1        = 585 ms (170940 KB/s), 100000000 -> 52334371
    -O2 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math -funroll-loops --param inline-unit-growth=999
    Code:
    snappy           = 538 ms (185873 KB/s), 100000000 -> 58350605
    lzjb_compress    = 1021 ms (97943 KB/s), 100000000 -> 68711273
    fastlz1_compress = 636 ms (157232 KB/s), 100000000 -> 55239233
    fastlz2_compress = 751 ms (133155 KB/s), 100000000 -> 54163013
    lzf_compress     = 706 ms (141643 KB/s), 100000000 -> 57695415
    lzrw1_compress   = 717 ms (139470 KB/s), 100000000 -> 59669043
    lzrw1a_compress  = 680 ms (147058 KB/s), 100000000 -> 59448006
    lzrw2_compress   = 726 ms (137741 KB/s), 100000000 -> 55312164
    lzrw3_compress   = 656 ms (152439 KB/s), 100000000 -> 52468327
    quicklz-1        = 576 ms (173611 KB/s), 100000000 -> 52334371
    Last edited by inikep; 24th March 2011 at 19:10.

  13. #13
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Sesse

    Interesting, but you are using an essentially seven-year-old version of GCC (gcc 3.4.0 came out April 2004, and the later 3.4.x releases are bugfixes only), and as far as I understand your flags, you're still missing -DNDEBUG.

    Are these results repeatable? I.e., how much do they vary between runs?

    /* Steinar */

  14. #14
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Sesse View Post
    There are some fixes for running the Snappy unit test natively on Windows (i.e., using mingw32) in the pipeline. The library itself should work just fine; these are mainly issues with timing and the like.
    This has hit Subversion now, as of r18.

    /* Steinar */

  15. #15
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Of course I'm using -DNDEBUG. The results are repeatable; the difference between runs is about 1-2%. GCC 3.4.5 was the default in older versions of MinGW, but I will try the newest.

  16. #16
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by inikep View Post
    Of course I'm using -DNDEBUG. The results are repeatable; the difference between runs is about 1-2%. GCC 3.4.5 was the default in older versions of MinGW, but I will try the newest.
    Thanks for the information. It's a useful data point (especially for 32-bit), despite the old compiler.

    /* Steinar */

  17. #17
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Quote Originally Posted by Sesse View Post
    Thanks for the information. It's a useful data point (especially for 32-bit), despite the old compiler.
    More results with GCC 4.5.2 and the Intel Compiler are here:
    http://encode.su/threads/1255-Google...ession-library

  18. #18
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @inikep: maybe the results with the new gcc 4.6.0 will be better?

    http://gcc.gnu.org/gcc-4.6/

    New optimization/support for

    Intel Core 2 : -march=core2 and -mtune=core2
    Intel Core i3/i5/i7 : -march=corei7 and -mtune=corei7
    Intel Core i3/i5/i7 processors with AVX : -march=corei7-avx and -mtune=corei7-avx

    best regards
    joerg

  19. #19
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    GCC 4.5.2 is the latest version for MinGW

  20. #20
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,475
    Thanks
    26
    Thanked 121 Times in 95 Posts
    What about Cygwin?

  21. #21
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts

    QuickLZ 1.5.1 and 1.6.0 experimental

    Hey there

    Please try the latest 1.5.1 beta from March 29th instead of 1.5.0. It increases the average compression speed from 308 to 341 MB/s on 64-bit x64 at level 1.

    1.5.0 beta:
    http://quicklz.com/beta.html
    Last edited by Lasse Reinhold; 3rd April 2011 at 15:23.

  22. #22
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    In-memory test (compression and decompression) with ENWIK8 using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit compilation under gcc 4.1.1 (Linux) -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999 -DNDEBUG -march=nocona):

    Code:
    snappy 1.0       = 472 ms (206 MB/s), 100000000 -> 58350605, 206 ms (474 MB/s)
    quicklz 1.5.0 -1 = 490 ms (199 MB/s), 100000000 -> 52334371, 502 ms (194 MB/s)
    quicklz 1.5.1 -1 = 470 ms (207 MB/s), 100000000 -> 52334371, 509 ms (191 MB/s)

  23. #23
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    "GCC 4.5.2 is the latest version for MinGW"

    That is right, but someone has made a pre-release "mingw with gcc 4.6" ...

    http://www.xvidvideo.ru/component/do...e-release.html

    gcc 4.6 has new optimization/support for

    Intel Core 2 : -march=core2 and -mtune=core2
    Intel Core i3/i5/i7 : -march=corei7 and -mtune=corei7
    Intel Core i3/i5/i7 processors with AVX : -march=corei7-avx and -mtune=corei7-avx

  24. #24
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    15 Sep 2011: new Snappy version: http://snappy.googlecode.com/files/snappy-1.0.4.tar.gz

    16 Sep 2011: the new LZO Professional version 3.1 is released,

    but I can't find any details, tests or binaries.

    ---

    The open-source variant, LZO 2.06, is from 12 Aug 2011, but there is no official binary for it either.

    http://www.oberhumer.com/products/lzo-professional/

    LZO Professional is fully source and binary compatible with LZO GPL. Of course, the compressed data is fully compatible as well.

    LZO Professional improves in all aspects over the LZO GPL edition:

    the compression levels for generating pre-compressed data achieve an improved compression ratio, thereby also improving decompression speed
    the decompressors - although blindingly fast right now - are still significantly faster, and this without using any assembly language
    the fast compression algorithms are much faster as well

    LZO Professional supports all major workstation, desktop and embedded architectures, including Alpha, AMD64, ARM, HPPA, I386, Itanium (IA64), M68K, MIPS, MIPS64, POWERPC, POWERPC64, S390, S390X, SH3, SH4, SPARC, SPARC64, X64, X86 and XSCALE.

    LZO Professional will be made available to interested parties as a binary-only evaluation library under a Non-Disclosure Agreement (NDA)
