Thread: In-memory benchmark with fastest LZSS (QuickLZ, Snappy) compressors

  1. #1
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts

    lzbench - an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors

    Attached is an in-memory benchmark (with source code) of the fastest open-source LZ77/LZSS compressors. I've joined all the compressors into a single executable. First, the input file is read into memory. Then each compressor compresses and decompresses the file, and the decompressed output is verified. The "i" option selects the number of iterations (default 1); the benchmark then displays the average time over N iterations.

    This approach has the big advantage of using the same compiler with the same optimizations for all compressors. The disadvantage is that it requires the source code of each compressor (therefore Slug and LZ4 are not included).
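The procedure described above (load the file into memory, run each codec for N iterations, verify the round trip, report the average) could be sketched roughly as follows in C. This is only an illustration, not the benchmark's actual source: the `codec_fn` signature, `copy_codec`, `bench`, and the output buffer bound are all assumptions made for the example, and `clock()` measures CPU time rather than wall-clock time.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical codec signature: returns bytes written to dst, 0 on failure. */
typedef size_t (*codec_fn)(const unsigned char *src, size_t src_len,
                           unsigned char *dst, size_t dst_cap);

/* Stand-in "compressor": a plain copy, comparable to a memcpy baseline. */
size_t copy_codec(const unsigned char *src, size_t src_len,
                  unsigned char *dst, size_t dst_cap)
{
    if (src_len > dst_cap) return 0;
    memcpy(dst, src, src_len);
    return src_len;
}

typedef struct {
    double comp_ms;   /* average compression time per iteration */
    double decomp_ms; /* average decompression time per iteration */
    size_t comp_size; /* compressed size in bytes */
    int    ok;        /* 1 if the round trip reproduced the input */
} bench_result;

static double now_ms(void)
{
    return (double)clock() * 1000.0 / CLOCKS_PER_SEC;
}

bench_result bench(codec_fn comp, codec_fn decomp,
                   const unsigned char *in, size_t in_len, int iters)
{
    bench_result r = {0.0, 0.0, 0, 0};
    size_t cap = in_len + in_len / 2 + 64; /* assumed expansion bound */
    unsigned char *packed = malloc(cap);
    unsigned char *unpacked = malloc(in_len ? in_len : 1);
    size_t out_len = 0;
    double t0, t1, t2;
    int i;

    if (!packed || !unpacked || iters < 1) {
        free(packed);
        free(unpacked);
        return r;
    }

    t0 = now_ms();
    for (i = 0; i < iters; i++)
        r.comp_size = comp(in, in_len, packed, cap);
    t1 = now_ms();
    for (i = 0; i < iters; i++)
        out_len = decomp(packed, r.comp_size, unpacked, in_len);
    t2 = now_ms();

    r.comp_ms = (t1 - t0) / iters;
    r.decomp_ms = (t2 - t1) / iters;
    /* Verification step: the decompressed data must match the input. */
    r.ok = (out_len == in_len) && memcmp(in, unpacked, in_len) == 0;

    free(packed);
    free(unpacked);
    return r;
}
```

Calling `bench(copy_codec, copy_codec, buf, len, 3)` would correspond to a memcpy baseline with 3 iterations; a real compressor would be plugged in through the same function-pointer signature.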
    Last edited by inikep; 19th April 2016 at 10:35.

  2. #2
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    In-memory test using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit compilation under gcc 4.1.1 (Linux) -O2 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999 -DNDEBUG):

    Code:
    benchmark 0.1 (c) Dell Inc.  Written by P.Skibinski
    memcpy           = 97 ms (1055 MB/s), 104854004 -> 104854004
    fastlz 0.1 -1    = 545 ms (187 MB/s), 104854004 -> 45614322, 308 ms (332 MB/s)
    fastlz 0.1 -2    = 557 ms (183 MB/s), 104854004 -> 43986331, 249 ms (411 MB/s)
    lzf 3.6 vf       = 502 ms (203 MB/s), 104854004 -> 44890314, 229 ms (447 MB/s)
    lzf 3.6 uf       = 495 ms (206 MB/s), 104854004 -> 47089435, 233 ms (439 MB/s)
    lzjb 2010        = 600 ms (170 MB/s), 104854004 -> 52693883, 315 ms (325 MB/s)
    lzo 2.04 1x      = 774 ms (132 MB/s), 104854004 -> 42726420, 217 ms (471 MB/s)
    lzrw1            = 596 ms (171 MB/s), 104854004 -> 51296084, 379 ms (270 MB/s)
    lzrw1-a          = 546 ms (187 MB/s), 104854004 -> 50630870, 302 ms (339 MB/s)
    lzrw2            = 491 ms (208 MB/s), 104854004 -> 47950899, 295 ms (347 MB/s)
    lzrw3            = 535 ms (191 MB/s), 104854004 -> 46384103, 351 ms (291 MB/s)
    snappy 1.0       = 303 ms (337 MB/s), 104854004 -> 46155676, 186 ms (550 MB/s)
    tornado 0.666 -1 = 540 ms (189 MB/s), 104854004 -> 47432525, 406 ms (252 MB/s)
    tornado 0.666 -2 = 674 ms (151 MB/s), 104854004 -> 40500470, 464 ms (220 MB/s)
    quicklz 1.5.0 -2 = 880 ms (116 MB/s), 104854004 -> 38965498, 479 ms (213 MB/s)
    quicklz 1.5.0 -1 = 383 ms (267 MB/s), 104854004 -> 42816655, 410 ms (249 MB/s)
    quicklz 1.5.1 -1 = 371 ms (276 MB/s), 104854004 -> 42816655, 418 ms (244 MB/s)
    all              = 17871 ms
    In-memory test using 1 core of Athlon X4 2.8 GHz (32-bit compilation under gcc 4.5.2 (MinGW) -O2 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999 -DNDEBUG -march=k8):
    Code:
    benchmark 0.1 (c) Dell Inc.  Written by P.Skibinski
    memcpy           = 67 ms (1528 MB/s), 104854004 -> 104854004
    fastlz 0.1 -1    = 487 ms (210 MB/s), 104854004 -> 45614322, 282 ms (363 MB/s)
    fastlz 0.1 -2    = 505 ms (202 MB/s), 104854004 -> 43986331, 218 ms (469 MB/s)
    lzf 3.6 vf       = 542 ms (188 MB/s), 104854004 -> 44890314, 202 ms (506 MB/s)
    lzf 3.6 uf       = 519 ms (197 MB/s), 104854004 -> 47089435, 204 ms (501 MB/s)
    lzjb 2010        = 627 ms (163 MB/s), 104854004 -> 52693883, 330 ms (310 MB/s)
    lzo 2.04 1x      = 796 ms (128 MB/s), 104854004 -> 42726420, 216 ms (474 MB/s)
    lzrw1            = 562 ms (182 MB/s), 104854004 -> 51296084, 276 ms (371 MB/s)
    lzrw1-a          = 538 ms (190 MB/s), 104854004 -> 50630870, 299 ms (342 MB/s)
    lzrw2            = 552 ms (185 MB/s), 104854004 -> 47950899, 386 ms (265 MB/s)
    lzrw3            = 591 ms (173 MB/s), 104854004 -> 46384103, 430 ms (238 MB/s)
    snappy 1.0       = 306 ms (334 MB/s), 104854004 -> 46155676, 147 ms (696 MB/s)
    tornado 0.666 -1 = 511 ms (200 MB/s), 104854004 -> 47432525, 346 ms (295 MB/s)
    tornado 0.666 -2 = 703 ms (145 MB/s), 104854004 -> 40500470, 487 ms (210 MB/s)
    quicklz 1.5.0 -2 = 1023 ms (100 MB/s), 104854004 -> 38965498, 344 ms (297 MB/s)
    quicklz 1.5.0 -1 = 392 ms (261 MB/s), 104854004 -> 42816655, 293 ms (349 MB/s)
    quicklz 1.5.1 -1 = 366 ms (279 MB/s), 104854004 -> 42816655, 293 ms (349 MB/s)
    all              = 15781 ms
    The input file (100 MB) is a concatenation of 10 different files, about 10 MB each: bmp, dct_coeffs, english_dic, ENWIK, exe, fp_log, hlp, XML, pdf, ncb.
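The MB/s figures in the tables above appear to be the uncompressed size divided by 1024 and then by the time in milliseconds, truncated to an integer (for example, 104854004 / 1024 / 97 gives 1055, and the decompression speed of fastlz -1 works out as 104854004 / 1024 / 308 = 332). That formula is inferred from the printed numbers, not taken from the benchmark source, and the helper names below are made up for illustration.

```c
#include <stdint.h>

/* Throughput as the tables seem to report it: (bytes / 1024) / ms, truncated.
   Both compression and decompression speeds use the uncompressed size. */
uint64_t speed_mb_s(uint64_t uncompressed_bytes, uint64_t ms)
{
    if (ms == 0) return 0; /* avoid division by zero on very fast runs */
    return uncompressed_bytes / 1024 / ms;
}

/* Compressed size as a percentage of the original size. */
double ratio_percent(uint64_t compressed_bytes, uint64_t original_bytes)
{
    if (original_bytes == 0) return 0.0;
    return 100.0 * (double)compressed_bytes / (double)original_bytes;
}
```

With this definition, the fastlz -1 row (104854004 -> 45614322) corresponds to a compressed size of roughly 43.5% of the original.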
    Last edited by inikep; 7th April 2011 at 14:18.

  3. #3
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    New results from Athlon X4 2.8 GHz (32-bit compilation under gcc 4.5.2 (MinGW)), 3 iterations, added LZ4:
    Code:
    memcpy              = 53 ms (1932 MB/s), 104854004->104854004
    fastlz 0.1 -1       = 497 ms (206 MB/s), 104854004->45614322, 233 ms (439 MB/s)
    fastlz 0.1 -2       = 524 ms (195 MB/s), 104854004->43986331, 218 ms (469 MB/s)
    LZ4                 = 579 ms (176 MB/s), 104854004->44774337, 149 ms (687 MB/s)
    lzf 3.6 vf          = 549 ms (186 MB/s), 104854004->44890314, 201 ms (509 MB/s)
    lzf 3.6 uf          = 508 ms (201 MB/s), 104854004->47089435, 203 ms (504 MB/s)
    lzjb 2010           = 637 ms (160 MB/s), 104854004->52693883, 325 ms (315 MB/s)
    lzo 2.04 1x         = 810 ms (126 MB/s), 104854004->42726420, 224 ms (457 MB/s)
    lzo 2.05 1x         = 243 ms (421 MB/s), 104854004->51883722, 192 ms (533 MB/s)
    lzrw1               = 581 ms (176 MB/s), 104854004->51296084, 281 ms (364 MB/s)
    lzrw1-a             = 552 ms (185 MB/s), 104854004->50630870, 289 ms (354 MB/s)
    lzrw2               = 556 ms (184 MB/s), 104854004->47950899, 391 ms (261 MB/s)
    lzrw3               = 599 ms (170 MB/s), 104854004->46384103, 432 ms (237 MB/s)
    snappy 1.0          = 307 ms (333 MB/s), 104854004->46155676, 150 ms (682 MB/s)
    tornado 0.666 -1    = 522 ms (196 MB/s), 104854004->47432525, 352 ms (290 MB/s)
    tornado 0.666 -2    = 719 ms (142 MB/s), 104854004->40500470, 498 ms (205 MB/s)
    quicklz 1.5.0 -2    = 1051 ms (97 MB/s), 104854004->38965498, 351 ms (291 MB/s)
    quicklz 1.5.0 -1    = 386 ms (265 MB/s), 104854004->42816655, 301 ms (340 MB/s)
    quicklz 1.5.1 -1    = 374 ms (273 MB/s), 104854004->42816655, 299 ms (342 MB/s)
    Last edited by inikep; 26th April 2011 at 11:34. Reason: added lzo 2.05

  4. #4
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    Quote Originally Posted by inikep View Post
    New results from Athlon X4 2.8 GHz (32-bit compilation under gcc 4.5.2 (MinGW)), 3 iterations, added LZ4, LZO 2.05:
    A list sorted by decompression time over compression ratio:
    [Attached image: xls.png]

  5. #5
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    Quote Originally Posted by roytam1 View Post
    sorted by Decompression Time over Compression Ratio list:
    An RLE implementation would win if it were included.
    Also, subcompressors should detect L1/L2/L3 cache sizes and speeds, and perform accordingly.
    Compressing as fast as possible and ending up with more than 200% of the best known losslessly compressed size should probably be called either subcompression or undercompression.
    Last edited by Alexander Rhatushnyak; 29th April 2011 at 17:23.

    This newsgroup is dedicated to image compression:
    http://linkedin.com/groups/Image-Compression-3363256

  6. #6
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    BENCHMARK 0.4. Changes: added crush, shrinker, and yappy; many programs have been updated.
    Yappy is very fast, but it is not safe against broken streams, among other problems. Congratulations to Cyan on the great improvements to LZ4 (both compression and decompression speed).
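To illustrate what "safe against broken streams" means for an LZ77-style decoder, here is a minimal sketch of a bounds-checked decoder. The byte format is a toy one invented for this example (it is not Yappy's or LZ4's format), and `toy_lz_decode` is a hypothetical name. The point is only that every literal run and every match is validated against the input and output limits before anything is copied, which is the kind of check a "fast" decoder may skip.

```c
#include <stddef.h>

/* Decode a toy LZ77 stream; returns the decoded length, or -1 if malformed.
   Toy format (hypothetical):
     token <  0x80: literal run, token + 1 raw bytes follow
     token >= 0x80: match of length (token & 0x7F) + 3,
                    followed by one offset byte (1..255) */
long toy_lz_decode(const unsigned char *src, size_t src_len,
                   unsigned char *dst, size_t dst_cap)
{
    size_t ip = 0, op = 0, i;

    while (ip < src_len) {
        unsigned token = src[ip++];
        if (token < 0x80) {
            size_t run = (size_t)token + 1;
            /* Reject runs that read past the input or write past the output. */
            if (run > src_len - ip || run > dst_cap - op) return -1;
            for (i = 0; i < run; i++) dst[op++] = src[ip++];
        } else {
            size_t len = (size_t)(token & 0x7F) + 3;
            size_t off;
            if (ip >= src_len) return -1;         /* offset byte missing */
            off = src[ip++];
            /* Reject offsets pointing before the start of the output. */
            if (off == 0 || off > op) return -1;
            if (len > dst_cap - op) return -1;    /* output overflow */
            /* Byte-wise copy so overlapping matches behave correctly. */
            for (i = 0; i < len; i++, op++) dst[op] = dst[op - off];
        }
    }
    return (long)op;
}
```

A decoder without these checks can be noticeably faster, but a corrupted or malicious stream may then read or write outside its buffers.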

    The results using 1 core of Intel Core i5-520M 2.4 GHz, Windows 8 (32-bit MinGW compilation under gcc 4.8.1) and 3 iterations. The input file (100 MB) is a concatenation of 10 different files, about 10 MB each: bmp, dct_coeffs, english_dic, ENWIK, exe, fp_log, hlp, XML, pdf, ncb.

    Code:
    memcpy              = 48 ms (2133 MB/s), 104854004->104854004
    crush 1.0 -cf       = 4373 ms (23 MB/s), 104854004->34318465, 474 ms (216 MB/s)
    fastlz 0.1 -1       = 534 ms (191 MB/s), 104854004->45614322, 232 ms (441 MB/s)
    fastlz 0.1 -2       = 530 ms (193 MB/s), 104854004->43986331, 249 ms (411 MB/s)
    lz4 rev 9           = 424 ms (241 MB/s), 104854004->44774336, 159 ms (644 MB/s)
    lz4 rev 10          = 349 ms (293 MB/s), 104854004->45520068, 154 ms (664 MB/s)
    lz4 rev 116 safe    = 284 ms (360 MB/s), 104854004->45274764, 92 ms (1113 MB/s)
    lz4 rev 116 fast    = 285 ms (359 MB/s), 104854004->45274764, 89 ms (1150 MB/s)
    lz4hc rev 116       = 6065 ms (16 MB/s), 104854004->35948473, 82 ms (1248 MB/s)
    lzf 3.6 vf          = 597 ms (171 MB/s), 104854004->44890314, 227 ms (451 MB/s)
    lzf 3.6 uf          = 586 ms (174 MB/s), 104854004->47089435, 234 ms (437 MB/s)
    lzjb 2010           = 624 ms (164 MB/s), 104854004->52693883, 317 ms (323 MB/s)
    lzmat 1.1           = 4741 ms (21 MB/s), 104854004->34419889, 373 ms (274 MB/s)
    lzo 2.06 1b_1       = 713 ms (143 MB/s), 104854004->43344892, 191 ms (536 MB/s)
    lzo 2.06 1b_9       = 1372 ms (74 MB/s), 104854004->39903850, 192 ms (533 MB/s)
    lzo 2.06 1b_99      = 1869 ms (54 MB/s), 104854004->38668219, 186 ms (550 MB/s)
    lzo 2.06 1c_1       = 699 ms (146 MB/s), 104854004->44096833, 209 ms (489 MB/s)
    lzo 2.06 1c_9       = 1505 ms (68 MB/s), 104854004->40537792, 201 ms (509 MB/s)
    lzo 2.06 1c_99      = 1812 ms (56 MB/s), 104854004->39492450, 196 ms (522 MB/s)
    lzo 2.06 1f_1       = 725 ms (141 MB/s), 104854004->44148573, 235 ms (435 MB/s)
    lzo 2.06 1x_1       = 294 ms (348 MB/s), 104854004->44682935, 222 ms (461 MB/s)
    lzo 2.06 1y_1       = 292 ms (350 MB/s), 104854004->44524272, 227 ms (451 MB/s)
    lzrw1               = 600 ms (170 MB/s), 104854004->51296084, 392 ms (261 MB/s)
    lzrw1-a             = 599 ms (170 MB/s), 104854004->50630870, 269 ms (380 MB/s)
    lzrw2               = 568 ms (180 MB/s), 104854004->47950899, 271 ms (377 MB/s)
    lzrw3               = 683 ms (149 MB/s), 104854004->46384103, 401 ms (255 MB/s)
    lzrw3-a             = 1373 ms (74 MB/s), 104854004->42580878, 423 ms (242 MB/s)
    shrinker            = 548 ms (186 MB/s), 104854004->41162640, 155 ms (660 MB/s)
    snappy 1.0.3        = 390 ms (262 MB/s), 104854004->46155676, 156 ms (656 MB/s)
    snappy 1.1.2        = 462 ms (221 MB/s), 104854004->44998050, 141 ms (726 MB/s)
    tornado 0.5 16k/1   = 541 ms (189 MB/s), 104854004->47432525, 357 ms (286 MB/s)
    tornado 128k/2m     = 575 ms (178 MB/s), 104854004->45166082, 360 ms (284 MB/s)
    tornado 128k/8m     = 590 ms (173 MB/s), 104854004->42299345, 359 ms (285 MB/s)
    tornado 4m/8m       = 1227 ms (83 MB/s), 104854004->38140549, 383 ms (267 MB/s)
    tornado b128k/8m    = 653 ms (156 MB/s), 104854004->37629695, 471 ms (217 MB/s)
    tornado b4m/8m      = 1316 ms (77 MB/s), 104854004->33769518, 484 ms (211 MB/s)
    tornado b4m/32m     = 1190 ms (86 MB/s), 104854004->29325608, 469 ms (218 MB/s)
    quicklz 1.5.0 -3    = 3160 ms (32 MB/s), 104854004->37633177, 160 ms (639 MB/s)
    quicklz 1.5.0 -2    = 814 ms (125 MB/s), 104854004->38965498, 407 ms (251 MB/s)
    quicklz 1.5.0 -1    = 383 ms (267 MB/s), 104854004->42816655, 351 ms (291 MB/s)
    quicklz 1.5.1 b7 -1 = 393 ms (260 MB/s), 104854004->42816655, 351 ms (291 MB/s)
    yappy 1             = 1783 ms (57 MB/s), 104854004->46825077, 80 ms (1279 MB/s)
    yappy 10            = 2195 ms (46 MB/s), 104854004->43628949, 73 ms (1402 MB/s)
    yappy 100           = 2888 ms (35 MB/s), 104854004->42780156, 69 ms (1484 MB/s)
    zlib 1.2.8 -1       = 2327 ms (44 MB/s), 104854004->35167222, 512 ms (199 MB/s)
    zlib 1.2.8 -6       = 5823 ms (17 MB/s), 104854004->31262824, 467 ms (219 MB/s)
    Last edited by inikep; 9th April 2014 at 17:21.

  8. #7
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Attached are sources and Win32 executable of BENCHMARK 0.3. Changes: added zlib, ucl_nrv, LZ4 rev 10.

    New results using 1 core of Athlon X4 2.8 GHz, Windows 7 (32-bit MinGW compilation under gcc 4.5.2) and 3 iterations. The input file (100 MB) is a concatenation of 10 different files, about 10 MB each: bmp, dct_coeffs, english_dic, ENWIK, exe, fp_log, hlp, XML, pdf, ncb.

    Code:
    memcpy              = 53 ms (1932 MB/s), 104854004->104854004
    fastlz 0.1 -1       = 494 ms (207 MB/s), 104854004->45614322, 233 ms (439 MB/s)
    fastlz 0.1 -2       = 518 ms (197 MB/s), 104854004->43986331, 218 ms (469 MB/s)
    lz4 rev 9           = 562 ms (182 MB/s), 104854004->44774336, 140 ms (731 MB/s)
    lz4 rev 10          = 330 ms (310 MB/s), 104854004->45520068, 134 ms (764 MB/s)
    lzf 3.6 vf          = 538 ms (190 MB/s), 104854004->44890314, 206 ms (497 MB/s)
    lzf 3.6 uf          = 506 ms (202 MB/s), 104854004->47089435, 207 ms (494 MB/s)
    lzham alpha6 -m0d26 = 28538 ms (3 MB/s), 104854004->25810349, 1287 ms (79 MB/s)
    lzjb 2010           = 636 ms (161 MB/s), 104854004->52693883, 303 ms (337 MB/s)
    lzmat 1.1           = 5050 ms (20 MB/s), 104854004->34419889, 375 ms (273 MB/s)
    lzo 2.05 1b_1       = 835 ms (122 MB/s), 104854004->43344892, 192 ms (533 MB/s)
    lzo 2.05 1b_9       = 1419 ms (72 MB/s), 104854004->39903850, 197 ms (519 MB/s)
    lzo 2.05 1b_99      = 1804 ms (56 MB/s), 104854004->38668219, 190 ms (538 MB/s)
    lzo 2.05 1c_1       = 808 ms (126 MB/s), 104854004->44096833, 197 ms (519 MB/s)
    lzo 2.05 1c_9       = 1442 ms (71 MB/s), 104854004->40537792, 207 ms (494 MB/s)
    lzo 2.05 1c_99      = 1810 ms (56 MB/s), 104854004->39492450, 204 ms (501 MB/s)
    lzo 2.05 1f_1       = 851 ms (120 MB/s), 104854004->44148573, 198 ms (517 MB/s)
    lzo 2.05 1x_1       = 246 ms (416 MB/s), 104854004->51883722, 192 ms (533 MB/s)
    lzo 2.05 1y_1       = 244 ms (419 MB/s), 104854004->51797619, 188 ms (544 MB/s)
    lzo 2.05 1b_999     = 15653 ms (6 MB/s), 104854004->35342929, 173 ms (591 MB/s)
    lzo 2.05 1c_999     = 11824 ms (8 MB/s), 104854004->36695814, 190 ms (538 MB/s)
    lzo 2.05 1f_999     = 13560 ms (7 MB/s), 104854004->36641050, 190 ms (538 MB/s)
    lzo 2.05 1x_999     = 31334 ms (3 MB/s), 104854004->34535021, 207 ms (494 MB/s)
    lzo 2.05 1y_999     = 29581 ms (3 MB/s), 104854004->34204260, 206 ms (497 MB/s)
    lzo 2.05 1z_999     = 31428 ms (3 MB/s), 104854004->34224781, 223 ms (459 MB/s)
    lzo 2.05 2a_999     = 12228 ms (8 MB/s), 104854004->37624779, 267 ms (383 MB/s)
    lzrw1               = 592 ms (172 MB/s), 104854004->51296084, 274 ms (373 MB/s)
    lzrw1-a             = 582 ms (175 MB/s), 104854004->50630870, 286 ms (358 MB/s)
    lzrw2               = 531 ms (192 MB/s), 104854004->47950899, 391 ms (261 MB/s)
    lzrw3               = 514 ms (199 MB/s), 104854004->46384103, 431 ms (237 MB/s)
    lzrw3-a             = 1417 ms (72 MB/s), 104854004->42580878, 463 ms (221 MB/s)
    snappy 1.0.3        = 307 ms (333 MB/s), 104854004->46155676, 141 ms (726 MB/s)
    tornado 0.666 16k/1 = 528 ms (193 MB/s), 104854004->47432525, 354 ms (289 MB/s)
    tornado 128k/2m     = 695 ms (147 MB/s), 104854004->45166082, 368 ms (278 MB/s)
    tornado 128k/8m     = 693 ms (147 MB/s), 104854004->42299345, 355 ms (288 MB/s)
    tornado 4m/8m       = 1675 ms (61 MB/s), 104854004->38140549, 405 ms (252 MB/s)
    tornado b128k/8m    = 793 ms (129 MB/s), 104854004->37629695, 481 ms (212 MB/s)
    tornado b4m/8m      = 1770 ms (57 MB/s), 104854004->33769518, 526 ms (194 MB/s)
    tornado b4m/32m     = 1513 ms (67 MB/s), 104854004->29325608, 501 ms (204 MB/s)
    quicklz 1.5.0 -3    = 3985 ms (25 MB/s), 104854004->37633177, 146 ms (701 MB/s)
    quicklz 1.5.0 -2    = 1013 ms (101 MB/s), 104854004->38965498, 351 ms (291 MB/s)
    quicklz 1.5.0 -1    = 373 ms (274 MB/s), 104854004->42816655, 300 ms (341 MB/s)
    quicklz 1.5.1 b5 -1 = 366 ms (279 MB/s), 104854004->42816655, 298 ms (343 MB/s)
    ucl_nrv2b 1.03 -1   = 7569 ms (13 MB/s), 104854004->37105362, 515 ms (198 MB/s)
    ucl_nrv2b 1.03 -6   = 12809 ms (7 MB/s), 104854004->34133511, 458 ms (223 MB/s)
    ucl_nrv2d 1.03 -1   = 7696 ms (13 MB/s), 104854004->36944802, 500 ms (204 MB/s)
    ucl_nrv2d 1.03 -6   = 12618 ms (8 MB/s), 104854004->34061261, 450 ms (227 MB/s)
    ucl_nrv2e 1.03 -1   = 7628 ms (13 MB/s), 104854004->36805095, 496 ms (206 MB/s)
    ucl_nrv2e 1.03 -6   = 12642 ms (8 MB/s), 104854004->33836500, 439 ms (233 MB/s)
    zlib 1.2.5 -1       = 2187 ms (46 MB/s), 104854004->35167222, 514 ms (199 MB/s)
    zlib 1.2.5 -6       = 5799 ms (17 MB/s), 104854004->31262824, 473 ms (216 MB/s)
    zlib 1.2.5 -9       = 19415 ms (5 MB/s), 104854004->31051160, 469 ms (218 MB/s)
    all                 = 950597 ms
    The results using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit Linux compilation under gcc 4.4.4) with the same options and input:
    Code:
    memcpy              = 64 ms (1599 MB/s), 104854004->104854004
    fastlz 0.1 -1       = 524 ms (195 MB/s), 104854004->45614322, 262 ms (390 MB/s)
    fastlz 0.1 -2       = 537 ms (190 MB/s), 104854004->43986331, 251 ms (407 MB/s)
    lz4 rev 9           = 408 ms (250 MB/s), 104854004->44774336, 167 ms (613 MB/s)
    lz4 rev 10          = 366 ms (279 MB/s), 104854004->45520068, 162 ms (632 MB/s)
    lzf 3.6 vf          = 483 ms (212 MB/s), 104854004->44890314, 206 ms (497 MB/s)
    lzf 3.6 uf          = 478 ms (214 MB/s), 104854004->47089435, 210 ms (487 MB/s)
    lzham alpha6 -m0d26 = 24385 ms (4 MB/s), 104854004->25810349, 878 ms (116 MB/s)
    lzjb 2010           = 620 ms (165 MB/s), 104854004->52693883, 303 ms (337 MB/s)
    lzmat 1.1           = 5689 ms (17 MB/s), 104854004->34419889, 361 ms (283 MB/s)
    lzo 2.05 1b_1       = 774 ms (132 MB/s), 104854004->43344892, 214 ms (478 MB/s)
    lzo 2.05 1b_9       = 1344 ms (76 MB/s), 104854004->39903850, 214 ms (478 MB/s)
    lzo 2.05 1b_99      = 1767 ms (57 MB/s), 104854004->38668219, 209 ms (489 MB/s)
    lzo 2.05 1c_1       = 734 ms (139 MB/s), 104854004->44096833, 217 ms (471 MB/s)
    lzo 2.05 1c_9       = 1490 ms (68 MB/s), 104854004->40537792, 223 ms (459 MB/s)
    lzo 2.05 1c_99      = 1795 ms (57 MB/s), 104854004->39492450, 218 ms (469 MB/s)
    lzo 2.05 1f_1       = 795 ms (128 MB/s), 104854004->44148573, 223 ms (459 MB/s)
    lzo 2.05 1x_1       = 272 ms (376 MB/s), 104854004->51881393, 195 ms (525 MB/s)
    lzo 2.05 1y_1       = 270 ms (379 MB/s), 104854004->51795089, 197 ms (519 MB/s)
    lzo 2.05 1b_999     = 12071 ms (8 MB/s), 104854004->35342929, 189 ms (541 MB/s)
    lzo 2.05 1c_999     = 9560 ms (10 MB/s), 104854004->36695814, 202 ms (506 MB/s)
    lzo 2.05 1f_999     = 10897 ms (9 MB/s), 104854004->36641050, 212 ms (483 MB/s)
    lzo 2.05 1x_999     = 25986 ms (3 MB/s), 104854004->34535021, 204 ms (501 MB/s)
    lzo 2.05 1y_999     = 24712 ms (4 MB/s), 104854004->34204260, 210 ms (487 MB/s)
    lzo 2.05 1z_999     = 26061 ms (3 MB/s), 104854004->34224781, 210 ms (487 MB/s)
    lzo 2.05 2a_999     = 9532 ms (10 MB/s), 104854004->37624779, 272 ms (376 MB/s)
    lzrw1               = 596 ms (171 MB/s), 104854004->51296084, 327 ms (313 MB/s)
    lzrw1-a             = 595 ms (172 MB/s), 104854004->50630870, 288 ms (355 MB/s)
    lzrw2               = 510 ms (200 MB/s), 104854004->47950899, 311 ms (329 MB/s)
    lzrw3               = 537 ms (190 MB/s), 104854004->46384103, 359 ms (285 MB/s)
    lzrw3-a             = 1449 ms (70 MB/s), 104854004->42580878, 363 ms (282 MB/s)
    snappy 1.0.3        = 294 ms (348 MB/s), 104854004->46155676, 149 ms (687 MB/s)
    tornado 0.666 16k/1 = 539 ms (189 MB/s), 104854004->47432525, 376 ms (272 MB/s)
    tornado 128k/2m     = 597 ms (171 MB/s), 104854004->45166082, 383 ms (267 MB/s)
    tornado 128k/8m     = 602 ms (170 MB/s), 104854004->42299345, 377 ms (271 MB/s)
    tornado 4m/8m       = 1140 ms (89 MB/s), 104854004->38140549, 397 ms (257 MB/s)
    tornado b128k/8m    = 653 ms (156 MB/s), 104854004->37629695, 423 ms (242 MB/s)
    tornado b4m/8m      = 1223 ms (83 MB/s), 104854004->33769518, 434 ms (235 MB/s)
    tornado b4m/32m     = 1101 ms (93 MB/s), 104854004->29325608, 423 ms (242 MB/s)
    quicklz 1.5.0 -3    = 3281 ms (31 MB/s), 104854004->37633177, 201 ms (509 MB/s)
    quicklz 1.5.0 -2    = 896 ms (114 MB/s), 104854004->38965498, 434 ms (235 MB/s)
    quicklz 1.5.0 -1    = 369 ms (277 MB/s), 104854004->42816655, 385 ms (265 MB/s)
    quicklz 1.5.1 b5 -1 = 332 ms (308 MB/s), 104854004->42816655, 381 ms (268 MB/s)
    ucl_nrv2b 1.03 -1   = 4413 ms (23 MB/s), 104854004->37105362, 576 ms (177 MB/s)
    ucl_nrv2b 1.03 -6   = 8697 ms (11 MB/s), 104854004->34133511, 518 ms (197 MB/s)
    ucl_nrv2d 1.03 -1   = 4378 ms (23 MB/s), 104854004->36944802, 564 ms (181 MB/s)
    ucl_nrv2d 1.03 -6   = 8569 ms (11 MB/s), 104854004->34061261, 507 ms (201 MB/s)
    ucl_nrv2e 1.03 -1   = 4466 ms (22 MB/s), 104854004->36805095, 562 ms (182 MB/s)
    ucl_nrv2e 1.03 -6   = 8703 ms (11 MB/s), 104854004->33836500, 504 ms (203 MB/s)
    zlib 1.2.5 -1       = 2178 ms (47 MB/s), 104854004->35167222, 489 ms (209 MB/s)
    zlib 1.2.5 -6       = 5808 ms (17 MB/s), 104854004->31262824, 445 ms (230 MB/s)
    zlib 1.2.5 -9       = 17902 ms (5 MB/s), 104854004->31051160, 439 ms (233 MB/s)
    all                 = 786724 ms
    Last edited by inikep; 6th June 2011 at 19:09.

  9. #8
    Member lz77's Avatar
    Join Date
    Jan 2016
    Location
    Russia
    Posts
    176
    Thanks
    60
    Thanked 16 Times in 12 Posts
    Quote Originally Posted by inikep View Post
    The input file (100 MB) is a concatenation of 10 different files, about 10 MB each: bmp, dct_coeffs, english_dic, ENWIK, exe, fp_log, hlp, XML, pdf, ncb.
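    To get the highest results, you should "warm up" the core before the benchmark:
    Code:
    // Needs <windows.h>. Pin the process to core 0, then keep that core busy.
    SetProcessAffinityMask(GetCurrentProcess(), 1);
    volatile unsigned zomg = 1;   // unsigned, so overflow wraps instead of being undefined
    for (unsigned i = 1; i < 1000000000u; i++)
        zomg *= i;
    Read about this at https://habrahabr.ru/post/113682/ (sorry, it is in Russian).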
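The snippet above is Windows-specific. A more portable warm-up, assuming only the C standard library, might spin for a fixed amount of CPU time instead of a fixed iteration count. `warm_up` is a hypothetical helper written for this example, and it does not pin the thread to a core; that part remains platform-specific.

```c
#include <time.h>

/* Keep the CPU busy for roughly `ms` milliseconds of CPU time.
   Returns the number of outer loop iterations that were executed. */
unsigned long warm_up(unsigned ms)
{
    volatile unsigned sink = 1;  /* volatile so the work is not optimized away */
    unsigned long iters = 0;
    unsigned i;
    clock_t end = clock() + (clock_t)((double)ms * CLOCKS_PER_SEC / 1000.0);

    while (clock() < end) {
        for (i = 1; i <= 1000; i++)
            sink *= i | 1u;      /* unsigned arithmetic wraps, no undefined behavior */
        iters++;
    }
    return iters;
}
```

Calling something like `warm_up(500)` right before the timed section would be the rough equivalent of the fixed-count loop; the appropriate duration depends on the CPU's frequency scaling and is not something this sketch can determine.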
    Could you please upload your input file? I want to download it to test my LZ77 examples, thanks

    Serge

  11. #9
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    697
    Thanks
    154
    Thanked 186 Times in 109 Posts
    Hi Inikep,

    I really like lzbench. It is a very nice tool for comparing LZ compression performance.

    Could you possibly add GLZA? A modified lzbench and the new GLZA source code are attached. Also, I wasn't able to get the "-eall" option to test more than one token's worth of files, so beware: I put in a hack to get around that problem.

    Here's a sample of the top 20 results on 1musk10.txt, sorted by compression ratio:

    Code:
    The results sorted by column number 5:
    Compressor name         Compress. Decompress. Compr. size  Ratio Filename
    glza 0.7.1               0.16 MB/s    25 MB/s      309745  23.03 1musk10.txt
    csc 3.3 -5               3.27 MB/s    35 MB/s      375672  27.94 1musk10.txt
    brotli 0.4.0 -11         0.37 MB/s   234 MB/s      382317  28.43 1musk10.txt
    lzlib 1.7 -6             1.51 MB/s    39 MB/s      385331  28.65 1musk10.txt
    lzlib 1.7 -9             1.54 MB/s    39 MB/s      386360  28.73 1musk10.txt
    lzma 9.38 -5             1.58 MB/s    59 MB/s      386490  28.74 1musk10.txt
    csc 3.3 -3               4.17 MB/s    34 MB/s      386589  28.75 1musk10.txt
    xz 5.2.2 -6              1.70 MB/s    54 MB/s      386689  28.76 1musk10.txt
    xz 5.2.2 -9              1.65 MB/s    53 MB/s      386689  28.76 1musk10.txt
    zstd 0.8.0 -22           1.82 MB/s   397 MB/s      392000  29.15 1musk10.txt
    zstd 0.8.0 -18           2.22 MB/s   404 MB/s      392716  29.20 1musk10.txt
    tornado 0.6a -16         1.81 MB/s   124 MB/s      392901  29.22 1musk10.txt
    tornado 0.6a -13         3.38 MB/s   116 MB/s      411640  30.61 1musk10.txt
    zstd 0.8.0 -15           3.49 MB/s   412 MB/s      413727  30.77 1musk10.txt
    lzham 1.0 -d26 -1        1.58 MB/s   140 MB/s      417559  31.05 1musk10.txt
    zling 2016-01-10 -4        20 MB/s    83 MB/s      417749  31.07 1musk10.txt
    csc 3.3 -1                 11 MB/s    34 MB/s      418998  31.16 1musk10.txt
    zling 2016-01-10 -3        23 MB/s    83 MB/s      421626  31.35 1musk10.txt
    tornado 0.6a -10         4.53 MB/s   121 MB/s      421810  31.37 1musk10.txt
    brotli 0.4.0 -8          5.93 MB/s   244 MB/s      425235  31.62 1musk10.txt
    Edit: I fixed the table and replaced the code because I found a memory initialization problem that shows up when a file is decoded repeatedly. There seems to be another bug: glza runs fine, but on files over 1 - 3 MB it causes the next program to crash. The cause is not obvious; I checked the memory pointers and they look okay, but the workmem address seems to be larger. I don't understand much of the lzbench code, so I'm not sure what is going on.
    Last edited by Kennon Conrad; 14th August 2016 at 22:36.

  13. #10
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    New results from Athlon X4 2.8 GHz (32-bit compilation under gcc 4.5.2 (MinGW)), 3 iterations, added LZHAM, LZMAT, LZO, tornado:
    Code:
    memcpy              = 52 ms (1969 MB/s), 104854004->104854004
    fastlz 0.1 -1       = 493 ms (207 MB/s), 104854004->45614322, 228 ms (449 MB/s)
    fastlz 0.1 -2       = 513 ms (199 MB/s), 104854004->43986331, 224 ms (457 MB/s)
    LZ4                 = 520 ms (196 MB/s), 104854004->44774337, 150 ms (682 MB/s)
    lzf 3.6 vf          = 556 ms (184 MB/s), 104854004->44890314, 204 ms (501 MB/s)
    lzf 3.6 uf          = 515 ms (198 MB/s), 104854004->47089435, 206 ms (497 MB/s)
    lzham alpha6 -m0d26 = 28639 ms (3 MB/s), 104854004->25810349, 1309 ms (78 MB/s)
    lzjb 2010           = 633 ms (161 MB/s), 104854004->52693883, 328 ms (312 MB/s)
    lzmat 1.1           = 5029 ms (20 MB/s), 104854004->34419889, 488 ms (209 MB/s)
    lzo 2.05 1          = 735 ms (139 MB/s), 104854004->46753759, 220 ms (465 MB/s)
    lzo 2.05 1_99       = 1798 ms (56 MB/s), 104854004->41866399, 213 ms (480 MB/s)
    lzo 2.05 1a         = 697 ms (146 MB/s), 104854004->46123353, 210 ms (487 MB/s)
    lzo 2.05 1a_99      = 2011 ms (50 MB/s), 104854004->41359708, 209 ms (489 MB/s)
    lzo 2.05 1b_1       = 836 ms (122 MB/s), 104854004->43344892, 191 ms (536 MB/s)
    lzo 2.05 1b_9       = 1398 ms (73 MB/s), 104854004->39903850, 198 ms (517 MB/s)
    lzo 2.05 1b_99      = 1792 ms (57 MB/s), 104854004->38668219, 191 ms (536 MB/s)
    lzo 2.05 1b_999     = 15865 ms (6 MB/s), 104854004->35342929, 172 ms (595 MB/s)
    lzo 2.05 1c_1       = 830 ms (123 MB/s), 104854004->44096833, 196 ms (522 MB/s)
    lzo 2.05 1c_9       = 1430 ms (71 MB/s), 104854004->40537792, 205 ms (499 MB/s)
    lzo 2.05 1c_99      = 1839 ms (55 MB/s), 104854004->39492450, 201 ms (509 MB/s)
    lzo 2.05 1c_999     = 12399 ms (8 MB/s), 104854004->36695814, 191 ms (536 MB/s)
    lzo 2.05 1f_1       = 841 ms (121 MB/s), 104854004->44148573, 199 ms (514 MB/s)
    lzo 2.05 1f_999     = 14112 ms (7 MB/s), 104854004->36641050, 191 ms (536 MB/s)
    lzo 2.05 1x_1       = 244 ms (419 MB/s), 104854004->51883722, 192 ms (533 MB/s)
    lzo 2.05 1x_999     = 32161 ms (3 MB/s), 104854004->34535021, 206 ms (497 MB/s)
    lzo 2.05 1y_1       = 242 ms (423 MB/s), 104854004->51797619, 188 ms (544 MB/s)
    lzo 2.05 1y_999     = 30562 ms (3 MB/s), 104854004->34204260, 206 ms (497 MB/s)
    lzo 2.05 1z_999     = 32317 ms (3 MB/s), 104854004->34224781, 221 ms (463 MB/s)
    lzo 2.05 2a_999     = 14087 ms (7 MB/s), 104854004->37624779, 256 ms (399 MB/s)
    lzrw1               = 577 ms (177 MB/s), 104854004->51296084, 282 ms (363 MB/s)
    lzrw1-a             = 554 ms (184 MB/s), 104854004->50630870, 293 ms (349 MB/s)
    lzrw2               = 560 ms (182 MB/s), 104854004->47950899, 389 ms (263 MB/s)
    lzrw3               = 601 ms (170 MB/s), 104854004->46384103, 435 ms (235 MB/s)
    lzrw3-a             = 1413 ms (72 MB/s), 104854004->42580878, 467 ms (219 MB/s)
    snappy 1.0          = 311 ms (329 MB/s), 104854004->46155676, 150 ms (682 MB/s)
    tornado 0.666 16k/1 = 537 ms (190 MB/s), 104854004->47432525, 356 ms (287 MB/s)
    tornado 128k/2m     = 695 ms (147 MB/s), 104854004->45166082, 366 ms (279 MB/s)
    tornado 128k/8m     = 696 ms (147 MB/s), 104854004->42299345, 356 ms (287 MB/s)
    tornado 4m/8m       = 1760 ms (58 MB/s), 104854004->38140549, 405 ms (252 MB/s)
    tornado b128k/8m    = 790 ms (129 MB/s), 104854004->37629695, 481 ms (212 MB/s)
    tornado b4m/8m      = 1814 ms (56 MB/s), 104854004->33769518, 522 ms (196 MB/s)
    tornado b4m/32m     = 1582 ms (64 MB/s), 104854004->29325608, 509 ms (201 MB/s)
    quicklz 1.5.0 -3    = 4731 ms (21 MB/s), 104854004->37633177, 147 ms (696 MB/s)
    quicklz 1.5.0 -2    = 1052 ms (97 MB/s), 104854004->38965498, 353 ms (290 MB/s)
    quicklz 1.5.0 -1    = 389 ms (263 MB/s), 104854004->42816655, 301 ms (340 MB/s)
    quicklz 1.5.1 b5 -1 = 349 ms (293 MB/s), 104854004->42816655, 300 ms (341 MB/s)

  14. #11
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by inikep View Post
    New results from Athlon X4 2.8 GHz (32-bit compilation under gcc 4.5.2 (MinGW)), 3 iterations, added LZHAM, LZMAT, LZO, tornado:
    Code:
    [...]
    quicklz 1.5.0 -2    = 1052 ms (97 MB/s), 104854004->38965498, 353 ms (290 MB/s)
    quicklz 1.5.0 -1    = 389 ms (263 MB/s), 104854004->42816655, 301 ms (340 MB/s)
    quicklz 1.5.1 b5 -1 = 349 ms (293 MB/s), 104854004->42816655, 300 ms (341 MB/s)
    What -c parameter of Tornado did you use?

  15. #12
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,505
    Thanks
    26
    Thanked 136 Times in 104 Posts
    CPU usage please

  16. #13
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    100% CPU usage of 1 core

  17. #14
    The Founder encode
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    crush?

  18. #15
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts
    It looks like he tested the -1..-7 modes and just listed only part of their specs.

  19. #16
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Quote Originally Posted by encode View Post
    crush?
    sources?

    Quote Originally Posted by Bulat Ziganshin View Post
    It looks like he tested the -1..-7 modes and just listed only part of their specs.
    No, these are my options. I was interested in fast compression and very fast decompression. The options are:

    // num coder_type tables row hashsize caching buffer parser hash3 shift update auxhash
    , { 1, BYTECODER, false, 1, 16*kb, 0, 1*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 2, BYTECODER, false, 1, 128*kb, 0, 2*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 3, BYTECODER, false, 1, 128*kb, 0, 8*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 4, BYTECODER, false, 1, 4*mb, 0, 8*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 5, BITCODER, false, 1, 128*kb, 0, 8*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 6, BITCODER, false, 1, 4*mb, 0, 8*mb, GREEDY, 0, 0, 999, 0, 0 }
    , { 7, BITCODER, false, 1, 4*mb, 0, 32*mb, GREEDY, 0, 0, 999, 0, 0 }
    Last edited by inikep; 31st May 2011 at 00:08.

  20. #17
    Expert
    Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Pareto frontier:
    Code:
    lzham alpha6 -m0d26 = 28639 ms (3 MB/s), 104854004->25810349, 1309 ms (78 MB/s)
    tornado b4m/32m     = 1582 ms (64 MB/s), 104854004->29325608, 509 ms (201 MB/s)
    lzo 2.05 1y_999     = 30562 ms (3 MB/s), 104854004->34204260, 206 ms (497 MB/s)
    lzo 2.05 1b_999     = 15865 ms (6 MB/s), 104854004->35342929, 172 ms (595 MB/s)
    quicklz 1.5.0 -3    = 4731 ms (21 MB/s), 104854004->37633177, 147 ms (696 MB/s)

  21. #18
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Pareto frontier:
    Code:
    lzham alpha6 -m0d26 = 28639 ms (3 MB/s), 104854004->25810349, 1309 ms (78 MB/s)
    tornado b4m/32m     = 1582 ms (64 MB/s), 104854004->29325608, 509 ms (201 MB/s)
    lzo 2.05 1y_999     = 30562 ms (3 MB/s), 104854004->34204260, 206 ms (497 MB/s)
    lzo 2.05 1b_999     = 15865 ms (6 MB/s), 104854004->35342929, 172 ms (595 MB/s)
    quicklz 1.5.0 -3    = 4731 ms (21 MB/s), 104854004->37633177, 147 ms (696 MB/s)
    Hi,

    We just released Snappy 1.0.3, which has some speed improvements in the decompressor; you may want to take it for a spin and see if it pushes the frontier. (By the way, it should be noted that this is the ratio/decompression time frontier only; you could imagine many others.)

    /* Steinar */
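
The ratio/decompression-time frontier listed above can be recomputed mechanically. What follows is a minimal sketch, not anyone's actual tooling: it assumes each result is reduced to (name, compressed size, decompression ms) and that smaller is better on both axes. The function name `pareto_frontier` is made up for illustration, and the rows are a subset copied from the Athlon X4 table quoted earlier in the thread (memcpy is left out).

```python
# Rows copied from the Athlon X4 table: (name, compressed bytes, decompression ms).
ROWS = [
    ("lzham alpha6 -m0d26", 25810349, 1309),
    ("tornado b4m/32m", 29325608, 509),
    ("tornado b4m/8m", 33769518, 522),
    ("lzo 2.05 1y_999", 34204260, 206),
    ("lzo 2.05 1z_999", 34224781, 221),
    ("lzmat 1.1", 34419889, 488),
    ("lzo 2.05 1x_999", 34535021, 206),
    ("lzo 2.05 1b_999", 35342929, 172),
    ("quicklz 1.5.0 -3", 37633177, 147),
    ("LZ4", 44774337, 150),
    ("snappy 1.0", 46155676, 150),
]


def pareto_frontier(rows):
    """Drop every row that another row matches or beats on both size and time."""
    frontier = []
    best_ms = float("inf")
    # Walk from smallest to largest output; a row survives only if it
    # decompresses strictly faster than every row that came before it.
    for name, size, ms in sorted(rows, key=lambda r: (r[1], r[2])):
        if ms < best_ms:
            frontier.append((name, size, ms))
            best_ms = ms
    return frontier


for name, size, ms in pareto_frontier(ROWS):
    print(f"{name:20s} {size:>10d} bytes  {ms:>5d} ms")
```

On these rows the sketch keeps the same five entries as the list above; snappy 1.0 and LZ4 miss out at 150 ms against 147 ms for quicklz 1.5.0 -3. Swapping the third field for compression time would give a different frontier, which is the point made about "many others".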

  22. #19
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Good work Sesse:
    Code:
    snappy 1.0.3        = 309 ms (331 MB/s), 104854004->46155676, 141 ms (726 MB/s)

  23. #20
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by inikep View Post
    Good work Sesse:
    Code:
    snappy 1.0.3        = 309 ms (331 MB/s), 104854004->46155676, 141 ms (726 MB/s)
    There are some more changes in Subversion now (post-1.0.3); they may or may not give you further decompression speed improvements. 5-10%, maybe, but it will definitely depend on the data, the compiler and the machine.

    In any case, a tighter Snappy compressor would make the decompression faster, if that's your primary goal.

    /* Steinar */

  24. #21
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,505
    Thanks
    26
    Thanked 136 Times in 104 Posts
    I have an uncomfortable (?) question:

    How many Googlers work on snappy?
    Last edited by Piotr Tarsa; 3rd June 2011 at 04:14. Reason: Corrected as requested.

  25. #22
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    10
    Thanked 4 Times in 4 Posts
    Just a language correction: I think you meant a 'delicate' or 'uncomfortable' question. Dictionaries suck.

  26. #23
    Member
    Join Date
    Mar 2011
    Location
    Google Switzerland
    Posts
    19
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    I have some cumbersome question

    How many Googlers work on snappy?
    None. All work is done on a volunteer basis, mostly in so-called "20% time". (You can see the changelog on the Google Code page; all changes are eventually reflected out there.)

    /* Steinar */

  27. #24
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,505
    Thanks
    26
    Thanked 136 Times in 104 Posts
    Weird. I thought Google used Snappy (or something like it) as part of the backend of their BigTable implementation.

  28. #25
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    New results using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit Linux compilation under gcc 4.4.4), 3 iterations, LZ4 removed because it segfaulted:
    Code:
    benchmark 0.2 (c) Dell Inc.  Written by P.Skibinski
    memcpy              = 64 ms (1599 MB/s), 104854004->104854004
    fastlz 0.1 -1       = 530 ms (193 MB/s), 104854004->45614322, 263 ms (389 MB/s)
    fastlz 0.1 -2       = 537 ms (190 MB/s), 104854004->43986331, 251 ms (407 MB/s)
    lzf 3.6 vf          = 482 ms (212 MB/s), 104854004->44890314, 208 ms (492 MB/s)
    lzf 3.6 uf          = 479 ms (213 MB/s), 104854004->47089435, 211 ms (485 MB/s)
    lzham alpha6 -m0d26 = 24419 ms (4 MB/s), 104854004->25810349, 879 ms (116 MB/s)
    lzjb 2010           = 619 ms (165 MB/s), 104854004->52693883, 296 ms (345 MB/s)
    lzmat 1.1           = 5801 ms (17 MB/s), 104854004->34419889, 370 ms (276 MB/s)
    lzo 2.05 1b_1       = 778 ms (131 MB/s), 104854004->43344892, 213 ms (480 MB/s)
    lzo 2.05 1b_9       = 1387 ms (73 MB/s), 104854004->39903850, 214 ms (478 MB/s)
    lzo 2.05 1b_99      = 1686 ms (60 MB/s), 104854004->38668219, 208 ms (492 MB/s)
    lzo 2.05 1c_1       = 734 ms (139 MB/s), 104854004->44096833, 216 ms (474 MB/s)
    lzo 2.05 1c_9       = 1510 ms (67 MB/s), 104854004->40537792, 222 ms (461 MB/s)
    lzo 2.05 1c_99      = 1827 ms (56 MB/s), 104854004->39492450, 217 ms (471 MB/s)
    lzo 2.05 1f_1       = 795 ms (128 MB/s), 104854004->44148573, 223 ms (459 MB/s)
    lzo 2.05 1x_1       = 274 ms (373 MB/s), 104854004->51881393, 194 ms (527 MB/s)
    lzo 2.05 1y_1       = 271 ms (377 MB/s), 104854004->51795089, 197 ms (519 MB/s)
    lzo 2.05 1b_999     = 12225 ms (8 MB/s), 104854004->35342929, 188 ms (544 MB/s)
    lzo 2.05 1c_999     = 10134 ms (10 MB/s), 104854004->36695814, 202 ms (506 MB/s)
    lzo 2.05 1f_999     = 11501 ms (8 MB/s), 104854004->36641050, 212 ms (483 MB/s)
    lzo 2.05 1x_999     = 26332 ms (3 MB/s), 104854004->34535021, 204 ms (501 MB/s)
    lzo 2.05 1y_999     = 25478 ms (4 MB/s), 104854004->34204260, 210 ms (487 MB/s)
    lzo 2.05 1z_999     = 26821 ms (3 MB/s), 104854004->34224781, 209 ms (489 MB/s)
    lzo 2.05 2a_999     = 10259 ms (9 MB/s), 104854004->37624779, 272 ms (376 MB/s)
    lzrw1               = 605 ms (169 MB/s), 104854004->51296084, 332 ms (308 MB/s)
    lzrw1-a             = 583 ms (175 MB/s), 104854004->50630870, 289 ms (354 MB/s)
    lzrw2               = 519 ms (197 MB/s), 104854004->47950899, 310 ms (330 MB/s)
    lzrw3               = 522 ms (196 MB/s), 104854004->46384103, 359 ms (285 MB/s)
    lzrw3-a             = 1447 ms (70 MB/s), 104854004->42580878, 362 ms (282 MB/s)
    snappy 1.0.3        = 296 ms (345 MB/s), 104854004->46155676, 149 ms (687 MB/s)
    tornado 0.666 16k/1 = 541 ms (189 MB/s), 104854004->47432525, 379 ms (270 MB/s)
    tornado 128k/2m     = 602 ms (170 MB/s), 104854004->45166082, 385 ms (265 MB/s)
    tornado 128k/8m     = 600 ms (170 MB/s), 104854004->42299345, 375 ms (273 MB/s)
    tornado 4m/8m       = 1092 ms (93 MB/s), 104854004->38140549, 394 ms (259 MB/s)
    tornado b128k/8m    = 650 ms (157 MB/s), 104854004->37629695, 420 ms (243 MB/s)
    tornado b4m/8m      = 1167 ms (87 MB/s), 104854004->33769518, 430 ms (238 MB/s)
    tornado b4m/32m     = 1087 ms (94 MB/s), 104854004->29325608, 420 ms (243 MB/s)
    quicklz 1.5.0 -3    = 3543 ms (28 MB/s), 104854004->37633177, 201 ms (509 MB/s)
    quicklz 1.5.0 -2    = 891 ms (114 MB/s), 104854004->38965498, 434 ms (235 MB/s)
    quicklz 1.5.0 -1    = 374 ms (273 MB/s), 104854004->42816655, 383 ms (267 MB/s)
    quicklz 1.5.1 b5 -1 = 337 ms (303 MB/s), 104854004->42816655, 382 ms (268 MB/s)
    all                 = 584391 ms
    If you want to run experiments yourself, the sources and a Win32 executable are attached.
    Last edited by inikep; 3rd June 2011 at 14:37.
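
The MB/s column in these tables can be reproduced from the byte count and the elapsed milliseconds. The sketch below is an inference from the printed numbers, not something taken from the benchmark source: the figures are consistent with dividing the size by 1024 and then by the time in ms, truncated to an integer, so "MB" here works out to 1024 x 1000 bytes. Both function names are hypothetical.

```python
def reported_mb_per_s(size_bytes: int, elapsed_ms: int) -> int:
    """Throughput as it appears in the tables (assumed rule: KiB per ms, truncated)."""
    return size_bytes // (1024 * elapsed_ms)


def decimal_mb_per_s(size_bytes: int, elapsed_ms: int) -> float:
    """Throughput in decimal megabytes (10**6 bytes) per second, for comparison."""
    return size_bytes / (1000 * elapsed_ms)


# memcpy row from the Xeon X5355 run: 104854004 bytes in 64 ms.
print(reported_mb_per_s(104854004, 64))
print(round(decimal_mb_per_s(104854004, 64), 1))
```

With decimal megabytes the same memcpy run reads about 1638 MB/s rather than 1599 MB/s, so figures from this thread are not directly comparable with tools that count 10^6 bytes per MB.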

  29. #26
    Member m^2
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Tested on Atom Z530 (the 1st Atom series, 1.6 GHz), Lubuntu 11.04, gcc 4.5.2, Silesia Compression Corpus
    Code:
    memcpy              = 349 ms (593 MB/s), 211957760->211957760
    fastlz 0.1 -1       = 3166 ms (65 MB/s), 211957760->104628335, 1867 ms (110 MB/s)
    fastlz 0.1 -2       = 3514 ms (58 MB/s), 211957760->100906274, 1626 ms (127 MB/s)
    LZ4                 = 5075 ms (40 MB/s), 211957760->98040200, 1011 ms (204 MB/s)
    lzf 3.6 vf          = 4355 ms (47 MB/s), 211957760->102041451, 1162 ms (178 MB/s)
    lzf 3.6 uf          = 3754 ms (55 MB/s), 211957760->105682277, 1168 ms (177 MB/s)
    lzjb 2010           = 3503 ms (59 MB/s), 211957760->122672025, 1826 ms (113 MB/s)
    lzmat 1.1           = 44845 ms (4 MB/s), 211957760->76486345, 2966 ms (69 MB/s)
    lzo 2.05 1b_1       = 5531 ms (37 MB/s), 211957760->97035718, 1232 ms (168 MB/s)
    lzo 2.05 1b_9       = 9245 ms (22 MB/s), 211957760->89264755, 1223 ms (169 MB/s)
    lzo 2.05 1b_99      = 12670 ms (16 MB/s), 211957760->85656302, 1207 ms (171 MB/s)
    lzo 2.05 1c_1       = 4642 ms (44 MB/s), 211957760->99551843, 1209 ms (171 MB/s)
    lzo 2.05 1c_9       = 9673 ms (21 MB/s), 211957760->91037796, 1217 ms (170 MB/s)
    lzo 2.05 1c_99      = 13055 ms (15 MB/s), 211957760->88118081, 1214 ms (170 MB/s)
    lzo 2.05 1f_1       = 5287 ms (39 MB/s), 211957760->99742137, 1227 ms (168 MB/s)
    lzo 2.05 1x_1       = 627 ms (330 MB/s), 211957760->208714918, 319 ms (648 MB/s)
    lzo 2.05 1y_1       = 631 ms (328 MB/s), 211957760->208714002, 323 ms (640 MB/s)
    lzrw1               = 3372 ms (61 MB/s), 211957760->113763206, 1654 ms (125 MB/s)
    lzrw1-a             = 3218 ms (64 MB/s), 211957760->112345946, 1616 ms (128 MB/s)
    lzrw2               = 3762 ms (55 MB/s), 211957760->105430138, 2092 ms (98 MB/s)
    lzrw3               = 3307 ms (62 MB/s), 211957760->100136247, 2379 ms (87 MB/s)
    lzrw3-a             = 9936 ms (20 MB/s), 211957760->90810520, 2472 ms (83 MB/s)
    snappy 1.0.3        = 3461 ms (59 MB/s), 211957760->104755703, 1010 ms (204 MB/s)
    tornado 0.666 16k/1 = 3497 ms (59 MB/s), 211957760->107391267, 1852 ms (111 MB/s)
    tornado 128k/2m     = 5058 ms (40 MB/s), 211957760->98475697, 2050 ms (100 MB/s)
    tornado 128k/8m     = 5256 ms (39 MB/s), 211957760->98082873, 2100 ms (98 MB/s)
    tornado 4m/8m       = 10962 ms (18 MB/s), 211957760->96103939, 2706 ms (76 MB/s)
    tornado b128k/8m    = 6093 ms (33 MB/s), 211957760->88062062, 3260 ms (63 MB/s)
    tornado b4m/8m      = 11699 ms (17 MB/s), 211957760->85838678, 3859 ms (53 MB/s)
    tornado b4m/32m     = 12286 ms (16 MB/s), 211957760->86000929, 3974 ms (52 MB/s)
    quicklz 1.5.0 -3    = 31374 ms (6 MB/s), 211957760->81822726, 1008 ms (205 MB/s)
    quicklz 1.5.0 -2    = 6380 ms (32 MB/s), 211957760->84554401, 2312 ms (89 MB/s)
    quicklz 1.5.0 -1    = 2356 ms (87 MB/s), 211957760->94724661, 1803 ms (114 MB/s)
    quicklz 1.5.1 b5 -1 = 2292 ms (90 MB/s), 211957760->94724661, 1802 ms (114 MB/s)
    all                 = 964956 ms (0 MB/s), 0->0
    done... (3 iterations)
    I don't understand makefile syntax, but oddly the build took the Windows branch.

  30. #27
    Member m^2
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Redid the test with the -s switch.
    Code:
    memcpy              = 330 ms (627 MB/s), 211957760->211957760
    fastlz 0.1 -1       = 3153 ms (65 MB/s), 211957760->104628335, 1757 ms (117 MB/s)
    fastlz 0.1 -2       = 3483 ms (59 MB/s), 211957760->100906274, 1611 ms (128 MB/s)
    LZ4                 = 5189 ms (39 MB/s), 211957760->98040200, 1002 ms (206 MB/s)
    lzf 3.6 vf          = 4082 ms (50 MB/s), 211957760->102041451, 1155 ms (179 MB/s)
    lzf 3.6 uf          = 3544 ms (58 MB/s), 211957760->105682277, 1151 ms (179 MB/s)
    lzham alpha6 -m0d26 = 260575 ms (0 MB/s), 211957760->64043081, 23558 ms (8 MB/s)
    lzjb 2010           = 4928 ms (42 MB/s), 211957760->122672025, 1799 ms (115 MB/s)
    lzmat 1.1           = 44318 ms (4 MB/s), 211957760->76486345, 2961 ms (69 MB/s)
    lzo 2.05 1b_1       = 4912 ms (42 MB/s), 211957760->97035718, 1226 ms (168 MB/s)
    lzo 2.05 1b_9       = 9234 ms (22 MB/s), 211957760->89264755, 1215 ms (170 MB/s)
    lzo 2.05 1b_99      = 12652 ms (16 MB/s), 211957760->85656302, 1203 ms (172 MB/s)
    lzo 2.05 1c_1       = 4630 ms (44 MB/s), 211957760->99551843, 1202 ms (172 MB/s)
    lzo 2.05 1c_9       = 9665 ms (21 MB/s), 211957760->91037796, 1211 ms (170 MB/s)
    lzo 2.05 1c_99      = 13034 ms (15 MB/s), 211957760->88118081, 1197 ms (172 MB/s)
    lzo 2.05 1f_1       = 5204 ms (39 MB/s), 211957760->99742137, 1219 ms (169 MB/s)
    lzo 2.05 1x_1       = 3462 ms (59 MB/s), 211957760->208714918, 334 ms (619 MB/s)
    lzo 2.05 1y_1       = 628 ms (329 MB/s), 211957760->208714002, 337 ms (614 MB/s)
    lzo 2.05 1b_999     = 141919 ms (1 MB/s), 211957760->76594616, 1111 ms (186 MB/s)
    lzo 2.05 1c_999     = 69688 ms (2 MB/s), 211957760->80397019, 1124 ms (184 MB/s)
    lzo 2.05 1f_999     = 76331 ms (2 MB/s), 211957760->80890513, 1181 ms (175 MB/s)
    lzo 2.05 1x_999     = 206147 ms (1 MB/s), 211957760->75302211, 1284 ms (161 MB/s)
    lzo 2.05 1y_999     = 203325 ms (1 MB/s), 211957760->75504114, 1277 ms (162 MB/s)
    lzo 2.05 1z_999     = 205937 ms (1 MB/s), 211957760->75061639, 1257 ms (164 MB/s)
    lzo 2.05 2a_999     = 64417 ms (3 MB/s), 211957760->82809608, 1593 ms (129 MB/s)
    lzrw1               = 3354 ms (61 MB/s), 211957760->113763206, 1640 ms (126 MB/s)
    lzrw1-a             = 3210 ms (64 MB/s), 211957760->112345946, 1607 ms (128 MB/s)
    lzrw2               = 3754 ms (55 MB/s), 211957760->105430138, 2083 ms (99 MB/s)
    lzrw3               = 3294 ms (62 MB/s), 211957760->100136247, 2363 ms (87 MB/s)
    lzrw3-a             = 9857 ms (20 MB/s), 211957760->90810520, 2456 ms (84 MB/s)
    snappy 1.0.3        = 3458 ms (59 MB/s), 211957760->104755703, 1005 ms (205 MB/s)
    tornado 0.666 16k/1 = 4069 ms (50 MB/s), 211957760->107391267, 1840 ms (112 MB/s)
    tornado 128k/2m     = 4927 ms (42 MB/s), 211957760->98475697, 2020 ms (102 MB/s)
    tornado 128k/8m     = 5155 ms (40 MB/s), 211957760->98082873, 2069 ms (100 MB/s)
    tornado 4m/8m       = 10641 ms (19 MB/s), 211957760->96103939, 2647 ms (78 MB/s)
    tornado b128k/8m    = 6181 ms (33 MB/s), 211957760->88062062, 3228 ms (64 MB/s)
    tornado b4m/8m      = 11425 ms (18 MB/s), 211957760->85838678, 3808 ms (54 MB/s)
    tornado b4m/32m     = 11802 ms (17 MB/s), 211957760->86000929, 3932 ms (52 MB/s)
    quicklz 1.5.0 -3    = 31202 ms (6 MB/s), 211957760->81822726, 1008 ms (205 MB/s)
    quicklz 1.5.0 -2    = 6366 ms (32 MB/s), 211957760->84554401, 2316 ms (89 MB/s)
    quicklz 1.5.0 -1    = 2348 ms (88 MB/s), 211957760->94724661, 1798 ms (115 MB/s)
    quicklz 1.5.1 b5 -1 = 2278 ms (90 MB/s), 211957760->94724661, 1799 ms (115 MB/s)
    all                 = 4761148 ms (0 MB/s), 0->0
    done... (3 iterations)
    Compression frontiers:
    Code:
    lzham alpha6 -m0d26 = 260575 ms (0 MB/s), 211957760->64043081, 23558 ms (8 MB/s)
    lzo 2.05 1z_999     = 205937 ms (1 MB/s), 211957760->75061639, 1257 ms (164 MB/s)
    lzo 2.05 1y_999     = 203325 ms (1 MB/s), 211957760->75504114, 1277 ms (162 MB/s)
    lzmat 1.1           = 44318 ms (4 MB/s), 211957760->76486345, 2961 ms (69 MB/s)
    quicklz 1.5.0 -3    = 31202 ms (6 MB/s), 211957760->81822726, 1008 ms (205 MB/s)
    quicklz 1.5.0 -2    = 6366 ms (32 MB/s), 211957760->84554401, 2316 ms (89 MB/s)
    tornado b128k/8m    = 6181 ms (33 MB/s), 211957760->88062062, 3228 ms (64 MB/s)
    quicklz 1.5.1 b5 -1 = 2278 ms (90 MB/s), 211957760->94724661, 1799 ms (115 MB/s)
    lzo 2.05 1y_1       = 628 ms (329 MB/s), 211957760->208714002, 337 ms (614 MB/s)
    Decompression frontiers:
    Code:
    lzham alpha6 -m0d26 = 260575 ms (0 MB/s), 211957760->64043081, 23558 ms (8 MB/s)
    lzmat 1.1           = 44318 ms (4 MB/s), 211957760->76486345, 2961 ms (69 MB/s)
    lzo 2.05 1z_999     = 205937 ms (1 MB/s), 211957760->75061639, 1257 ms (164 MB/s)
    lzo 2.05 1b_999     = 141919 ms (1 MB/s), 211957760->76594616, 1111 ms (186 MB/s)
    quicklz 1.5.0 -3    = 31202 ms (6 MB/s), 211957760->81822726, 1008 ms (205 MB/s)
    LZ4                 = 5189 ms (39 MB/s), 211957760->98040200, 1002 ms (206 MB/s)
    lzo 2.05 1y_1       = 628 ms (329 MB/s), 211957760->208714002, 337 ms (614 MB/s)
    lzo 2.05 1x_1       = 3462 ms (59 MB/s), 211957760->208714918, 334 ms (619 MB/s)
    I expected LZHAM to decompress faster.
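
Both frontier lists can be derived straight from the pasted output instead of by hand. This is a rough sketch under stated assumptions: the line format is taken from the results above, while `RESULT_RE`, `parse_result` and `frontier` are illustrative names, and "frontier" is read strictly (a row is dropped if a smaller-or-equal output is also at least as fast). Only a handful of lines from the -s run are included.

```python
import re

# Matches lines such as
#   "lzo 2.05 1y_1 = 628 ms (329 MB/s), 211957760->208714002, 337 ms (614 MB/s)"
# The decompression part is optional because the memcpy line has none.
RESULT_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*=\s*(?P<c_ms>\d+) ms \(\d+ MB/s\),\s*"
    r"(?P<size_in>\d+)\s*->\s*(?P<size_out>\d+)"
    r"(?:,\s*(?P<d_ms>\d+) ms \(\d+ MB/s\))?\s*$"
)


def parse_result(line):
    """Return a dict for one result line, or None if the line is not a result."""
    m = RESULT_RE.match(line)
    if m is None:
        return None
    d_ms = m.group("d_ms")
    return {
        "name": m.group("name"),
        "c_ms": int(m.group("c_ms")),
        "size_in": int(m.group("size_in")),
        "size_out": int(m.group("size_out")),
        "d_ms": int(d_ms) if d_ms is not None else None,
    }


def frontier(results, time_key):
    """Names on the size/time frontier; time_key is "c_ms" or "d_ms".

    Rows without the chosen time (e.g. memcpy for "d_ms") must be filtered out first.
    """
    kept, best = [], float("inf")
    for r in sorted(results, key=lambda r: (r["size_out"], r[time_key])):
        if r[time_key] < best:
            kept.append(r["name"])
            best = r[time_key]
    return kept


LINES = """
    lzham alpha6 -m0d26 = 260575 ms (0 MB/s), 211957760->64043081, 23558 ms (8 MB/s)
    lzmat 1.1           = 44318 ms (4 MB/s), 211957760->76486345, 2961 ms (69 MB/s)
    lzo 2.05 1b_999     = 141919 ms (1 MB/s), 211957760->76594616, 1111 ms (186 MB/s)
    lzo 2.05 1x_999     = 206147 ms (1 MB/s), 211957760->75302211, 1284 ms (161 MB/s)
    lzo 2.05 1z_999     = 205937 ms (1 MB/s), 211957760->75061639, 1257 ms (164 MB/s)
    quicklz 1.5.0 -3    = 31202 ms (6 MB/s), 211957760->81822726, 1008 ms (205 MB/s)
    done... (3 iterations)
""".splitlines()

RESULTS = [r for r in map(parse_result, LINES) if r is not None]
print("compression:  ", frontier(RESULTS, "c_ms"))
print("decompression:", frontier(RESULTS, "d_ms"))
```

Under this strict reading lzmat 1.1 stays on the compression list but drops off the decompression list, because lzo 2.05 1z_999 is both smaller and faster to decompress in the table above; the hand-made list may have used a looser rule.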

  31. #28
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    892
    Thanks
    492
    Thanked 280 Times in 120 Posts
    Could you please update LZ4 source to its latest revision (r ?
    http://code.google.com/p/lz4/

    It should correct the segfault issue

    Rgds

  32. #29
    Member m^2
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    inikep, could you add NRV to the test?

  33. #30
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Quote Originally Posted by m^2 View Post
    inikep, could you add NRV to the test?
    Where can I download NRV?
