
Thread: LZ5 - a modification of LZ4 which gives a better ratio at cost of slower compression

  1. #31
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Good work m^2. The bug is fixed in "dev". Thanks.

  2. #32
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Your gratitude gave me a lot of positive energy. Thank you for that.

  3. #33
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    LZ5 v1.3.3 beta can be downloaded at:
    https://github.com/inikep/lz5/archive/dev.zip

    Changes:
    - added: new levels from 11 to 16 (maximum compression ratio)
    - added: a new parser: LZ5HC_optimal_price
    - fixed: buffer-overflow during decompression (thanks to m^2)

    Please help me with fuzzing. Here is a quick comparison:

    Code:
    Compressor name              Compression Decompress. Compr. size  Ratio
    crush 1.0 level 0               14 MB/s    195 MB/s     50419812  48.08 
    crush 1.0 level 1             4.09 MB/s    211 MB/s     48195021  45.96
    crush 1.0 level 2             0.55 MB/s    214 MB/s     47105187  44.92
    lz5hc r132 level 10             12 MB/s    670 MB/s     48109030  45.88
    lz5hc r132 level 11           6.60 MB/s    592 MB/s     47639520  45.43
    lz5hc r132 level 12           3.22 MB/s    670 MB/s     47461368  45.26
    lz5hc r132 level 13           1.09 MB/s    642 MB/s     47382059  45.19
    lz5hc v1.3.3 level 10           12 MB/s    677 MB/s     48109030  45.88
    lz5hc v1.3.3 level 11         8.76 MB/s    692 MB/s     47438817  45.24
    lz5hc v1.3.3 level 12         7.02 MB/s    651 MB/s     47063261  44.88
    lz5hc v1.3.3 level 13         4.75 MB/s    645 MB/s     46718698  44.55
    lz5hc v1.3.3 level 14         3.84 MB/s    671 MB/s     46484969  44.33
    lz5hc v1.3.3 level 15         1.93 MB/s    712 MB/s     46227364  44.09
    lz5hc v1.3.3 level 16         0.80 MB/s    709 MB/s     46125742  43.99
    Last edited by inikep; 18th December 2015 at 21:06.

  4. Thanks:

    m^2 (19th December 2015)

  5. #34
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Are there any changes in decompression?
    I have not finished fuzzing it yet.

  6. #35
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    There is a small change in decompression.
    However, if decompression works correctly in r132, it will also work correctly in v1.3.3.

  7. #36
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    Code:
    katmacadapc% ./fullbench-notrap data_c/000-000 
    *** LZ5 speed analyzer v1.3.3 64-bits, by Yann Collet (Dec 28 2015) ***
    Compression functions : 
    ../lib/lz5.c:871:25: runtime error: null pointer passed as argument 2, which is declared to never be null
    /usr/include/string.h:50:14: note: nonnull attribute specified here
    SUMMARY: AddressSanitizer: undefined-behavior ../lib/lz5.c:871:25 in 
    ERROR ! LZ5_saveDict() = 0 !! 
    fullbench-notrap: fullbench.c:773: int fullSpeedBench(char **, int): Assertion `0' failed.
    zsh: abort      ./fullbench-notrap data_c/000-000
    Note: line numbers may differ in fullbench.c as I needed to modify it.
    Attached Files:
    • File Type: 7z 1.7z (116 Bytes, 73 views)

  8. Thanks:

    inikep (4th January 2016)

  9. #37
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Thanks. I'll take care of it when I get back from vacation.

  10. #38
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    92
    Thanks
    27
    Thanked 15 Times in 14 Posts
    Quote Originally Posted by inikep View Post
    ...
    Code:
    Compressor name              Compression Decompress. Compr. size  Ratio
    ...
    crush 1.0 level 2             0.55 MB/s    214 MB/s     47105187  44.92
    ...
    lz5hc v1.3.3 level 16         0.80 MB/s    709 MB/s     46125742  43.99
    Whoa, now it can beat Crush? Being 3x+ faster on decompression? Sounds interesting, let's see how it performs on my data :P.

  11. #39
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    @m^3: Please check https://github.com/inikep/lz5/tree/dev and verify whether the bugs inherited from LZ4 have been removed.

  12. #40
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    92
    Thanks
    27
    Thanked 15 Times in 14 Posts
    Btw, I've given the -dev version of LZ5 a try. On compressing a Linux kernel it beats crush any day. On other data it may or may not win; it really depends on the nature of the data, and in some cases crush happens to be just too strong. Overall, though, it is quite an improvement. This world has clearly gone crazy: a byte-aligned LZ beating a bit-aligned one, and another bit-aligned LZ showing the LZ+entropy coders who the real king of LZ compression is.

    NB: I've found the table defining the levels and attempted to "turbocharge" level 16 a little, but the gains seem negligible.

    What are "technical" limits on various members of
    Code:
        U32 windowLog;     /* largest match distance : impact decompression buffer size */
        U32 contentLog;    /* full search segment : larger == more compression, slower, more memory (useless for fast) */
        U32 hashLog;       /* dispatch table : larger == more memory, faster*/
        U32 hashLog3;      /* dispatch table : larger == more memory, faster*/
        U32 searchNum;     /* nb of searches : larger == more compression, slower*/
        U32 searchLength;  /* size of matches : larger == faster decompression */
        U32 sufficientLength;  /* used only by optimal parser: size of matches which is acceptable: larger == more compression, slower */
        LZ5HC_strategy strategy;
    ...such that the compressor still generates valid output and an (unmodified) decoder can still parse it?

    E.g., could I increase the dictionary? And what does "impact decompression buffer size" mean? I thought this data format "requires no memory for decompression". Have I seriously missed something?

  13. #41
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    The limits:
    Code:
      { 10, 10, 10,  0,     0,  4,     0,  LZ5HC_fast          }, // min values
      { 24, 24, 28, 24, 1<<24,  7, 1<<24,  LZ5HC_optimal_price }, // max values
    In my experiments windowLog=22 gives the best ratio (on average). It means that the compressor finds matches with offsets up to 1<<22 bytes back. Therefore, during decompression, LZ5 requires access to up to 4 MB (22 bits) of recently decoded data. LZ4 uses only a 64 KB dictionary (16 bits).
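
    For illustration, here is a minimal, self-contained sketch of a custom level that stays inside these bounds, together with the window arithmetic behind the 4 MB figure. The struct and type names are assumed for the sketch; only the field names and bounds come from the posts above.
    Code:
    #include <stdio.h>
    #include <stdint.h>

    typedef uint32_t U32;

    /* Subset of the real strategy enum, just enough for this sketch. */
    typedef enum { LZ5HC_fast, LZ5HC_optimal_price } LZ5HC_strategy;

    /* Field names copied from the parameter struct quoted earlier;
       the struct name itself is made up for this sketch. */
    typedef struct {
        U32 windowLog, contentLog, hashLog, hashLog3;
        U32 searchNum, searchLength, sufficientLength;
        LZ5HC_strategy strategy;
    } LZ5HC_parameters;

    /* An example custom level kept within the published min/max rows. */
    static const LZ5HC_parameters customLevel = {
        22, 22, 23, 16,     /* windowLog=22 => 4 MB window, as recommended above */
        1<<10, 4, 1<<10,    /* searchNum/sufficientLength in 0..1<<24, searchLength in 4..7 */
        LZ5HC_optimal_price
    };

    int main(void)
    {
        unsigned window = 1u << customLevel.windowLog;
        printf("match offsets reach up to %u bytes (%u MB) back,\n"
               "so the decoder must keep that much recently decoded data addressable\n",
               window, window >> 20);
        return 0;
    }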
    Last edited by inikep; 4th January 2016 at 22:36.

  14. Thanks:

    xcrh (4th January 2016)

  15. #42
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    92
    Thanks
    27
    Thanked 15 Times in 14 Posts
    Quote Originally Posted by inikep View Post
    The limits:
    Thanks for the definitive answer!
    In my experiments windowLog=22 gives the best ratio (on average).
    It seems to depend on what you compress (and on how large & redundant it happens to be). I played with crush a bit some time ago and got the impression that too big a dictionary can be bad. Though I guess it depends on the amount of data and its nature.

    It can look like this:
    Code:
    lz5hc r132 level 6            6.22 MB/s    244 MB/s        97972  42.62
    lz5hc r132 level 7            5.71 MB/s    242 MB/s        97992  42.62
    lz5hc r132 level 9            2.16 MB/s    238 MB/s        98088  42.67
    lz5hc r132 level 12           1.77 MB/s    228 MB/s        99986  43.49
    lz5hc r132 level 16           1.56 MB/s    228 MB/s        99932  43.47
    ...so lz5hc level 6 was the best; lower and higher levels were worse on this small file. I used LZ5 from your -dev branch (I put it into lzbench myself).

    On the other hand, increasing the parameters and making the dictionary larger makes sense in some cases.
    E.g., an Ubuntu x86-64 Linux kernel, a ~22 MB chunk of code & data which is redundant (large areas of zeros, etc.):
    Code:
    //LZ5HC 16, stock from -dev:
    lz5hc r132 level 16           0.37 MB/s    179 MB/s      7130425  31.27
    
    //"supercharged" level 16, dictionary not touched, matches tweaked - { 22, 22, 23, 16,  8192, 3,8192,LZ5HC_optimal_price }
    lz5hc r132 level 16           0.03 MB/s    177 MB/s      7122521  31.24
    
    //"ultimate" level 16 - also increased dicionary & hash tables - { 24, 24, 27, 24,  8192, 3,8192,LZ5HC_optimal_price },
    lz5hc r132 level 16           0.03 MB/s    171 MB/s      7007389  30.74
    
    //zlib & crush - for reference purposes:
    zlib 1.2.8 level 9            3.21 MB/s     85 MB/s      6781199  29.74
    zlib 1.2.8 level 6            7.83 MB/s     85 MB/s      6793262  29.80
    crush 1.0 level 2             0.25 MB/s     78 MB/s      7267494  31.88
    Ok, "ultimate" setup is very slow (but still faster than ratio-inclined things like LZOMA) and also hogs memory (not surprising). Getting rid of ~120Kb isn't bad, granted decompression speed stays, but hardly epic. And of course it makes no effect on small files. Btw, crush 2 got outperformed in all cases, though maybe it is a good idea to try various dict sizes for crush to see what it can. I wonder how bad my settings are, if we assume I want to get rid a lot of zeros, etc but preferrably without waiting for a week.

    Let's take a look at competitors in more or less the same league...
    Code:
    lz4hc r131 level 16           1.31 MB/s    520 MB/s      8732628  38.30 
    lzo1z 2.09 level 999          2.53 MB/s    148 MB/s      7543941  33.09
    lzo1x 2.09 level 999          2.49 MB/s    153 MB/s      7933488  34.80
    lzo1y 2.09 level 999          2.51 MB/s    153 MB/s      7953888  34.89
    lzo2a 2.09 level 999          6.72 MB/s    124 MB/s      8079096  35.44
    lzo1f 2.09 level 999          5.57 MB/s    157 MB/s      8265099  36.25
    lzo1b 2.09 level 999          3.38 MB/s    177 MB/s      8556961  37.53
    lzo1c 2.09 level 999          5.92 MB/s    187 MB/s      8612912  37.78
    1) lz4: rocks in terms of decompression speed, but the ratio is nowhere close.
    2) LZO got outperformed quite a bit: LZ5 achieved a better ratio AND better decompression speed. LZO only reaches comparable decompression speed when its ratio is poor, and if someone is OK with such a ratio, LZ4 would decompress much faster. Strangely, the built-in stub uses lzo1x, while 1z is clearly the better tradeoff.

    It means that the compressor finds matches with offsets up to 1<<22 bytes back. Therefore, during decompression, LZ5 requires access to up to 4 MB (22 bits) of recently decoded data. LZ4 uses only a 64 KB dictionary (16 bits).
    OK, I got it: it means you can't flush and discard this data. Not a big deal if one decompresses a src block to a dst block and keeps both source and destination in memory; in that sense no extra memory is needed. But if one does not need this data in memory and wants to flush it, that is not an option, and that's what the note really means. Thanks for the explanation.
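
    To make the "no extra memory" case concrete, here is a minimal sketch, assuming LZ5 keeps LZ4's block API, i.e. an LZ5_decompress_safe(src, dst, compressedSize, dstCapacity) function and an lz5.h header (both are assumptions, not checked against the sources):
    Code:
    #include <stdlib.h>
    #include "lz5.h"   /* header name assumed */

    /* Decompress one block into a single contiguous buffer. Because dst holds the
       whole decoded output, every match offset (up to 1<<windowLog bytes back)
       points into dst itself, so no separate 4 MB dictionary is needed. Streaming
       with flushed/discarded output is the case where the window really costs memory. */
    char *decode_block(const char *src, int srcSize, int decodedSize)
    {
        char *dst = malloc((size_t)decodedSize);
        if (!dst) return NULL;
        if (LZ5_decompress_safe(src, dst, srcSize, decodedSize) < 0) {
            free(dst);               /* corrupted or truncated input */
            return NULL;
        }
        return dst;
    }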

  16. Thanks:

    inikep (5th January 2016)

  17. #43
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    LZ5 v1.3.3 (final) can be downloaded at:
    https://github.com/inikep/lz5/releases

    Changes from r132:
    - added: new levels from 11 to 18 (maximum compression ratio)
    - added: a new parser: LZ5HC_optimal_price
    - fixed: buffer-overflow during decompression (thanks to m^2)

    Here is a quick comparison:
    Code:
    Compressor name              Compression Decompress. Compr. size  Ratio
    crush 1.0 level 0               14 MB/s    195 MB/s     50419812  48.08
    crush 1.0 level 1             4.09 MB/s    211 MB/s     48195021  45.96
    crush 1.0 level 2             0.55 MB/s    214 MB/s     47105187  44.92
    lz5hc r132 level 10             12 MB/s    670 MB/s     48109030  45.88
    lz5hc r132 level 11           6.60 MB/s    592 MB/s     47639520  45.43
    lz5hc r132 level 12           3.22 MB/s    670 MB/s     47461368  45.26
    lz5hc r132 level 13           1.09 MB/s    642 MB/s     47382059  45.19
    lz5hc v1.3.3 level 10           12 MB/s    677 MB/s     48109030  45.88
    lz5hc v1.3.3 level 11         8.76 MB/s    692 MB/s     47438817  45.24
    lz5hc v1.3.3 level 12         7.02 MB/s    651 MB/s     47063261  44.88
    lz5hc v1.3.3 level 13         4.75 MB/s    645 MB/s     46718698  44.55
    lz5hc v1.3.3 level 14         3.84 MB/s    671 MB/s     46484969  44.33
    lz5hc v1.3.3 level 15         1.93 MB/s    712 MB/s     46227364  44.09
    lz5hc v1.3.3 level 16         0.80 MB/s    709 MB/s     46125742  43.99
    lz5hc v1.3.3 level 17         0.39 MB/s    679 MB/s     46050114  43.92
    lz5hc v1.3.3 level 18         0.16 MB/s    541 MB/s     46008853  43.88

  18. Thanks (3):

    comp1 (5th January 2016),Stephan Busch (8th January 2016),xcrh (8th January 2016)

  19. #44
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    I tested lz5 on LTCB and Silesia.
    http://mattmahoney.net/dc/text.html#3196
    http://mattmahoney.net/dc/silesia.html

    I've fallen behind on testing other compressors. If you're in a hurry, you can test yourself and send me results.

  20. Thanks:

    inikep (6th January 2016)

  21. #45
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    inikep, I have a request for you: when you find a bug inherited from LZ4 (IIRC there are 2 already), please report it back to Yann.

    I'll test your changes in the following days.

  22. #46
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    I reported bugs in LZ4 at https://github.com/Cyan4973/lz4/issues/178

  23. Thanks (3):

    m^2 (7th January 2016),nemequ (7th January 2016),xcrh (8th January 2016)

  24. #47
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    92
    Thanks
    27
    Thanked 15 Times in 14 Posts
    Okay, thanks to the new match finder in -dev, it seems I've got a new "high score" while compressing the kernel with LZ5...
    Code:
    Compressor name              Compression Decompress. Compr. size  Ratio 
    memcpy                          32 MB/s     61 MB/s     22799232 100.00
    lz5hc r132 level 15           0.59 MB/s    176 MB/s      6978174  30.61
    zlib 1.2.8 level 1              20 MB/s     86 MB/s      7268620  31.88
    zlib 1.2.8 level 3              16 MB/s     89 MB/s      7062079  30.98
    zlib 1.2.8 level 6            7.80 MB/s     85 MB/s      6793262  29.80
    zlib 1.2.8 level 9            3.21 MB/s     86 MB/s      6781199  29.74
    I used a custom level 15: { 24, 24, 27, 24, 1<<11, 4, 1<<11, 1, LZ5HC_optimal_price_bt }, // level 15. It also happens to be "faster" (compared to the previous similar setup).

    Some observations:
    1) Setting the match search above 1<<10 has little effect, and above 1<<11 the gains are very small while compression speed drops a lot.
    2) On this relatively large file a large window pays for itself, noticeably improving the ratio.
    3) On my laptop, which runs all sorts of crap, hashLog=28 probably ran out of memory, because I got a not-so-inspiring 100% ratio. After trying various hashLog values I got the impression it does not hurt speed much with my settings, and hashLog=27 is fine for my "ultimate" level; it does not run out of memory.
    4) I was curious whether I could get below the 7,000,000-byte mark. OK, now I can!

    5) On another file (uncompressed PC-generated graphics from some game, a very highly compressible PNG saved at the "stored" level, a bit above 2 MB in size), only my custom levels with an increased window managed to beat crush level 2 :P. Crush did very well here. Speeds are exaggerated due to the extreme ratios.

    Code:
    $ ./lzbench -elz5hc,3,6,8,9,10,12,13,14,15/zlib,1,3,6,9/crush,2 tanktop-female.png 
    lzbench 0.9.1 (64-bit Linux)   Assembled by P.Skibinski
    Compressor name              Compression Decompress. Compr. size  Ratio 
    memcpy                        1830 MB/s   1830 MB/s      2216456 100.00
    lz5hc r132 level 3              99 MB/s   1019 MB/s        51325   2.32
    lz5hc r132 level 6              19 MB/s   1048 MB/s        40738   1.84
    lz5hc r132 level 8              18 MB/s   1054 MB/s        40657   1.83
    lz5hc r132 level 9              13 MB/s   1079 MB/s        39140   1.77
    lz5hc r132 level 10             12 MB/s   1098 MB/s        38454   1.73
    lz5hc r132 level 12             17 MB/s   1019 MB/s        40031   1.81
    lz5hc r132 level 13           7.34 MB/s   1054 MB/s        40796   1.84
    lz5hc r132 level 14           0.03 MB/s   1092 MB/s        36428   1.64
    lz5hc r132 level 15           0.06 MB/s   1087 MB/s        36362   1.64
    zlib 1.2.8 level 1              89 MB/s    298 MB/s        46346   2.09
    zlib 1.2.8 level 3              87 MB/s    303 MB/s        44820   2.02
    zlib 1.2.8 level 6              43 MB/s    185 MB/s        33968   1.53
    zlib 1.2.8 level 9            9.85 MB/s    191 MB/s        32081   1.45
    crush 1.0 level 2             1.31 MB/s    309 MB/s        36698   1.66
    Here level 14 is also custom: { 23, 23, 23, 23, 1<<12, 4, 1<<12, 1, LZ5HC_optimal_price_bt }, // level 14. It has a smaller dictionary, but the match finding is probably overaggressive: compression speed fell even below that of level 15, which did a bit better and was twice as fast.


    Bonus:
    ...what about beating... zlib level 9! Compressing lzbench itself, using the mentioned levels 14/15 with increased windows.
    Code:
    lz5hc r132 level 13           2.19 MB/s    263 MB/s      1978115  23.99
    lz5hc r132 level 14           1.38 MB/s    265 MB/s      1959714  23.77
    lz5hc r132 level 15           0.91 MB/s    265 MB/s      1959808  23.77
    zlib 1.2.8 level 1              25 MB/s    101 MB/s      2192223  26.59
    zlib 1.2.8 level 3              19 MB/s    105 MB/s      2107582  25.56
    zlib 1.2.8 level 6            9.76 MB/s     98 MB/s      1982685  24.05
    zlib 1.2.8 level 9            3.71 MB/s     98 MB/s      1970716  23.90
    crush 1.0 level 2             0.42 MB/s     96 MB/s      1975334  23.96
    Zlib had a disadvantage: a small window, while the statically linked file is ~8 MB. On the other hand, it had Huffman coding on its side. A nice detail is that LZ5's decompression speed scored ~3x faster than zlib's. That's what I call a "nice shot" when it comes to compressing something and getting an interesting result.

    Overall, LZ5 now looks quite interesting! It beats the highest LZO levels while being a bit faster to decompress, and its compression ratio can approach UCL's while decompressing 2x-3x faster. LZ5 seems to be an interesting tradeoff when one somewhat cares about ratio but needs high decompression speed: not as blazing fast as LZ4, but a bit above LZO.
    Last edited by xcrh; 8th January 2016 at 06:38.

  25. Thanks:

    inikep (8th January 2016)

  26. #48
    Member
    Join Date
    Jul 2013
    Location
    United States
    Posts
    194
    Thanks
    44
    Thanked 140 Times in 69 Posts
    m^3: did you finish fuzzing? I added your results to CompFuzz, and I'm wondering if I should mark LZ5 as "OK" in the results table.

  27. #49
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    No, I'm not done. I share my CPU between ZSTD and LZ5 (the latter gets ca. 80%) and both have been running for months already.
    It's hard to predict how long a session will take, but I expect I need another 1-5 weeks, unless there are new commits that add more work.
    Nevertheless, I think you can mark it OK because:
    * it has received a lot of fuzzing already, and
    * there are no known issues

  28. #50
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Quote Originally Posted by xcrh View Post
    Okay, thanks to new match finder in -dev, it seems I've got new "high-score" while compressing kernel using LZ5...
    I think you can get even better results with content-aware compression (a compressor designed especially for a given kind of data). This is the future of lossless data compression, because in general-purpose compression people are currently working mainly on speed improvements.

  29. #51
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    The latest dev compiles with many warnings:
    Code:
    afl-clang-fast 1.94b by <lszekeres@google.com>
    In file included from lz5.c:41:
    ./lz5common.h:184:30: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5HC_DEBUG(fmt, args...) ;//printf(fmt, ##args)
                                 ^
    ./lz5common.h:185:33: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_PARSER(fmt, args...) ;//printf(fmt, ##args)
                                    ^
    ./lz5common.h:186:32: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_PRICE(fmt, args...) ;//printf(fmt, ##args)
                                   ^
    ./lz5common.h:187:33: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_ENCODE(fmt, args...) ;//printf(fmt, ##args)
                                    ^
    In file included from lz5.c:41:
    In file included from ./lz5common.h:140:
    ./mem.h:305:17: warning: unused function 'MEM_highbit' [-Wunused-function]
    static unsigned MEM_highbit(U32 val)
                    ^
    ./mem.h:420:13: warning: unused function 'MEM_wildcopy' [-Wunused-function]
    static void MEM_wildcopy(void* dst, const void* src, size_t length)
                ^
    In file included from lz5.c:41:
    ./lz5common.h:150:15: warning: unused function 'LZ5HC_hash3Ptr' [-Wunused-function]
    static size_t LZ5HC_hash3Ptr(const void* ptr, U32 h) { return LZ5HC_hash3(MEM_read32(ptr), h); }
                  ^
    ./lz5common.h:168:15: warning: unused function 'LZ5HC_hashPtr' [-Wunused-function]
    static size_t LZ5HC_hashPtr(const void* p, U32 hBits, U32 mls)
                  ^
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 3840 locations (non-hardened mode, ratio 100%).
    8 warnings generated.
    In file included from lz5hc.c:42:
    ./lz5common.h:184:30: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5HC_DEBUG(fmt, args...) ;//printf(fmt, ##args)
                                 ^
    ./lz5common.h:185:33: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_PARSER(fmt, args...) ;//printf(fmt, ##args)
                                    ^
    ./lz5common.h:186:32: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_PRICE(fmt, args...) ;//printf(fmt, ##args)
                                   ^
    ./lz5common.h:187:33: warning: named variadic macros are a GNU extension [-Wvariadic-macros]
    #define LZ5_LOG_ENCODE(fmt, args...) ;//printf(fmt, ##args)
                                    ^
    lz5hc.c:102:102: warning: unused parameter 'iLimit' [-Wunused-parameter]
    FORCE_INLINE void LZ5HC_BinTree_Insert (LZ5HC_Data_Structure* ctx, const BYTE* ip, const BYTE* const iLimit)
                                                                                                         ^
    lz5hc.c:859:32: warning: cast from 'const char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
        ctx->inputBuffer = (BYTE*) source;
                                   ^
    lz5hc.c:1288:32: warning: cast from 'const char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
        ctx->inputBuffer = (BYTE*) source;
                                   ^
    lz5hc.c:1343:30: warning: cast from 'const unsigned char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
            best_pos = (uint8_t*)ip;
                                 ^
    lz5hc.c:1345:26: warning: cast from 'const unsigned char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
            off0 = (uint8_t*)ip - ref;
                             ^
    lz5hc.c:1348:30: warning: cast from 'const unsigned char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
            for (pos = (uint8_t*)ip + ml; pos >= start2; pos--)
                                 ^
    lz5hc.c:1433:32: warning: cast from 'const char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
        ctx->inputBuffer = (BYTE*) source;
                                   ^
    lz5hc.c:1564:32: warning: cast from 'const char *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
        ctx->inputBuffer = (BYTE*) source;
                                   ^
    In file included from lz5hc.c:42:
    In file included from ./lz5common.h:140:
    ./mem.h:305:17: warning: unused function 'MEM_highbit' [-Wunused-function]
    static unsigned MEM_highbit(U32 val)
                    ^
    ./mem.h:420:13: warning: unused function 'MEM_wildcopy' [-Wunused-function]
    static void MEM_wildcopy(void* dst, const void* src, size_t length)
                ^
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 3205 locations (non-hardened mode, ratio 100%).
    14 warnings generated.
    lz5frame.c:898:74: warning: comparison of constant 4194304 with expression of type 'LZ5F_blockMode_t' is always false [-Wtautological-constant-out-of-range-compare]
        bufferNeeded = dctxPtr->maxBlockSize + ((dctxPtr->frameInfo.blockMode==LZ5F_DICT_SIZE) * 2 * LZ5F_DICT_SIZE);
                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ ~~~~~~~~~~~~~~
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 642 locations (non-hardened mode, ratio 100%).
    1 warning generated.
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 179 locations (non-hardened mode, ratio 100%).
    creating versioned links
    make[1]: Leaving directory `/home/madamczyk/afl-1.94b/lz5/lib'
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/lz5.c -c -o ../lib/lz5.o
    afl-clang-fast 1.94b by <lszekeres@google.com>
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 3840 locations (non-hardened mode, ratio 100%).
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/lz5hc.c -c -o ../lib/lz5hc.o
    afl-clang-fast 1.94b by <lszekeres@google.com>
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 3205 locations (non-hardened mode, ratio 100%).
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/lz5frame.c -c -o ../lib/lz5frame.o
    afl-clang-fast 1.94b by <lszekeres@google.com>
    ../lib/lz5frame.c:898:74: warning: comparison of constant 4194304 with expression of type 'LZ5F_blockMode_t' is always false [-Wtautological-constant-out-of-range-compare]
        bufferNeeded = dctxPtr->maxBlockSize + ((dctxPtr->frameInfo.blockMode==LZ5F_DICT_SIZE) * 2 * LZ5F_DICT_SIZE);
                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ ~~~~~~~~~~~~~~
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 642 locations (non-hardened mode, ratio 100%).
    1 warning generated.
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/xxhash.c -c -o ../lib/xxhash.o
    afl-clang-fast 1.94b by <lszekeres@google.com>
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 179 locations (non-hardened mode, ratio 100%).
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     fullbench.c -c -o fullbench.o
    afl-clang-fast 1.94b by <lszekeres@google.com>
    afl-llvm-pass 1.94b by <lszekeres@google.com>
    [+] Instrumented 353 locations (non-hardened mode, ratio 100%).
    ../../afl-clang-fast -DFORTIFY_SOURCE=2 -fstack-protector-all -fsanitize=undefined,address -fsanitize-trap=array-bounds,bool,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,nonnull-attribute,null,object-size,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,vla-bound -fno-sanitize=alignment -g -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/lz5.o ../lib/lz5hc.o ../lib/lz5frame.o ../lib/xxhash.o fullbench.o -o fullbench
    afl-clang-fast 1.94b by <lszekeres@google.com>
    One even looks like a bug (lz5frame.c:898).
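
    For what it's worth, the named-variadic-macro warnings should go away with the standard C99 form (a sketch of the idea; the expansion stays a bare ';' just like the current disabled macros):
    Code:
    /* C99 __VA_ARGS__ instead of the GNU "args..." extension used in lz5common.h */
    #define LZ5HC_DEBUG(...)      ; /* printf(__VA_ARGS__) */
    #define LZ5_LOG_PARSER(...)   ; /* printf(__VA_ARGS__) */
    #define LZ5_LOG_PRICE(...)    ; /* printf(__VA_ARGS__) */
    #define LZ5_LOG_ENCODE(...)   ; /* printf(__VA_ARGS__) */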

  30. Thanks:

    inikep (11th January 2016)

  31. #52
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    Code:
    katmacadapc% ../../../fullbench -c10 -d9 -i1 id:000000*               
    *** LZ5 speed analyzer v1.4 64-bits, by Yann Collet (Jan 11 2016) ***
    - 1 iterations -
    LZ5_compressHC
    LZ5F_decompress
    =================================================================
    ==27488==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x631000024800 at pc 0x000000516683 bp 0x7fff477a0020 sp 0x7fff477a0018
    WRITE of size 8 at 0x631000024800 thread T0
        #0 0x516682  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x516682)
        #1 0x565e8d  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x565e8d)
        #2 0x571d12  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x571d12)
        #3 0x56feff  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x56feff)
        #4 0x572a96  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x572a96)
        #5 0x7f090c344eac  (/lib/x86_64-linux-gnu/libc.so.6+0x1eeac)
        #6 0x41b578  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x41b578)
    
    0x631000024800 is located 0 bytes to the right of 65536-byte region [0x631000014800,0x631000024800)
    allocated by thread T0 here:
        #0 0x4b8952  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x4b8952)
        #1 0x56849c  (/home/madamczyk/afl-1.94b/lz5/fullbench+0x56849c)
    
    SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/madamczyk/afl-1.94b/lz5/fullbench+0x516682) 
    Shadow bytes around the buggy address:
      0x0c627fffc8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0c627fffc8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0c627fffc8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0c627fffc8e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0c627fffc8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    =>0x0c627fffc900:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c627fffc910: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c627fffc920: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c627fffc930: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c627fffc940: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c627fffc950: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07 
      Heap left redzone:       fa
      Heap right redzone:      fb
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack partial redzone:   f4
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==27488==ABORTING
    Attached Files

  32. #53
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    @m^3: I could not reproduce the bug in fullbench using gcc.

  33. #54
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    The code doesn't compile on my gcc, probably because the standard used is GNU89.
    Code:
    cc -O0 -I../lib -DXXH_NAMESPACE=LZ5_ -DLZ5_VERSION=\"v1.4\"  -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math     ../lib/lz5.c -c -o ../lib/lz5.o
    In file included from ../lib/lz5.c:41:0:
    ../lib/lz5common.h:209:1: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:209:39: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:209:56: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:209:73: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:214:37: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:214:56: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:214:78: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:214:92: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:214:109: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:220:40: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:220:59: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:220:81: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:220:95: error: unknown type name ‘uint32_t’
    ../lib/lz5common.h:220:126: error: unknown type name ‘uint32_t’
    make: *** [../lib/lz5.o] Error 1
    mv: cannot stat `fullbench': No such file or directory

  34. #55
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    The previous crashes were caused by an issue I introduced in fullbench.c, sorry.

  35. #56
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    LZ4 and LZ5 require -std=c99.

  36. #57
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    Then you can clean this up:
    Code:
    /****************************************************************
    *  Basic Types
    *****************************************************************/
    #if defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
    # include <stdint.h>
      typedef  uint8_t BYTE;
      typedef uint16_t U16;
      typedef  int16_t S16;
      typedef uint32_t U32;
      typedef  int32_t S32;
      typedef uint64_t U64;
      typedef  int64_t S64;
    #else
      typedef unsigned char       BYTE;
      typedef unsigned short      U16;
      typedef   signed short      S16;
      typedef unsigned int        U32;
      typedef   signed int        S32;
      typedef unsigned long long  U64;
      typedef   signed long long  S64;
    #endif
    And also make sure that -std=c99 is passed to the compiler regardless of custom CFLAGS. In lib it is; in programs it is not.
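
    For illustration, once -std=c99 is guaranteed everywhere, the block above could collapse to just the <stdint.h> branch (a sketch of the idea, not an actual patch):
    Code:
    /* Basic Types, relying on C99 being mandatory */
    #include <stdint.h>
      typedef  uint8_t BYTE;
      typedef uint16_t U16;
      typedef  int16_t S16;
      typedef uint32_t U32;
      typedef  int32_t S32;
      typedef uint64_t U64;
      typedef  int64_t S64;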

  37. #58
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    Quote Originally Posted by m^3 View Post
    And also make sure that -std=c99 is passed to the compiler regardless of custom CFLAGS. In lib it is; in programs it is not.
    Thanks, it's fixed now. It will also solve the problem of too many warnings.

  38. #59
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    92
    Thanks
    27
    Thanked 15 Times in 14 Posts
    Quote Originally Posted by inikep View Post
    I think that you can get even better results with content-aware compression (with a compressor designed especially for a given kind(s) of data). This is the future of lossless data compression because currently in general-purpose compression people are working mainly on speed improvements.
    I'm pretty sure the ratio can be improved this way; e.g., E8E9 and its ARM equivalent (I don't remember its name, but it exists) can improve compression, but they require extra work on decompression. And at these speeds, extra work in the decompressor tends to cost quite a bit of speed. This is especially true on ARM.
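
    For reference, a minimal sketch of the generic x86 E8/E9 filter being referred to (this is the textbook transform, not code from LZ5; a production filter needs endianness handling and tighter range checks):
    Code:
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Turn relative CALL/JMP targets into absolute ones so repeated calls to the
       same function become identical byte sequences that an LZ matcher can find.
       The decoder applies the inverse (subtract instead of add) after decompression;
       that pass is exactly the "extra action" that costs speed on weak CPUs. */
    static void e8e9_forward(uint8_t *buf, size_t len)
    {
        for (size_t i = 0; i + 5 <= len; ) {
            if (buf[i] == 0xE8 || buf[i] == 0xE9) {        /* CALL rel32 / JMP rel32 */
                int32_t rel;
                memcpy(&rel, buf + i + 1, 4);              /* assumes little-endian host */
                int32_t abs_tgt = rel + (int32_t)(i + 5);  /* fine for inputs < 2 GB */
                memcpy(buf + i + 1, &abs_tgt, 4);
                i += 5;                                    /* skip opcode + operand */
            } else {
                i++;
            }
        }
    }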

    To give some idea of the speeds: on an ARM board where I rolled a "dev" environment and did some experiments, xz decompresses at about 10 MB/s, lzma at about 15 MB/s, zlib at ~50 MB/s, zstd around 60, LZ5 around 100, and LZ4 about 130; several other "fast LZs" score between LZ4 and LZ5 or so, with ratios usually close to LZ4's. So LZ4 is good for speed, and LZ5 is good for a better ratio without a major loss of speed. At most, LZ4 fast 17 goes about 170 MB/s, but its ratio is really bad. Blosclz can reach a whopping 400+ MB/s, exceeding memcpy just as they advertise, but the ratio is very low and only makes sense for mem-to-mem transfers. As you can see, any entropy coding immediately cuts decompression speed by about 2x.

  39. #60
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    LZ5 v1.4 beta can be downloaded at:
    https://github.com/inikep/lz5/archive/dev.zip

    Changes:
    - improved: levels from 13 to 15 (maximum compression ratio)
    - added: a new parser: LZ5HC_optimal_price_bt

    Please help me with fuzzing. Here is a quick comparison:

    Code:
    Compressor name              Compression Decompress. Compr. size  Ratio
    crush 1.0 level 0               14 MB/s    195 MB/s     50419812  48.08 
    crush 1.0 level 1             4.09 MB/s    211 MB/s     48195021  45.96
    crush 1.0 level 2             0.55 MB/s    214 MB/s     47105187  44.92
    lz5hc v1.3.3 level 13         4.75 MB/s    645 MB/s     46718698  44.55
    lz5hc v1.3.3 level 14         3.84 MB/s    671 MB/s     46484969  44.33
    lz5hc v1.3.3 level 15         1.93 MB/s    712 MB/s     46227364  44.09
    lz5hc v1.3.3 level 16         0.80 MB/s    709 MB/s     46125742  43.99
    lz5hc v1.4 level 13           5.38 MB/s    710 MB/s     46383307  44.23
    lz5hc v1.4 level 14           4.12 MB/s    669 MB/s     45843096  43.72
    lz5hc v1.4 level 15           2.16 MB/s    619 MB/s     45767126  43.65

  40. Thanks:

    Cyan (26th January 2016)
