
Thread: lzpm 0.07 overview

  1. #1
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Some news about this project. This version may represent a new generation of my compressors.

    Firstly, this version has no file size limits (64-bit file sizes)! I also switched to pure hash chaining; as a result we see some compression gain, and even a speed improvement. However, LZPM's memory usage is now N + 16*N, i.e. 17*N. The buffer size is 16 MB, so you can calculate the memory usage from that. Note that decompression still uses N + 4 MB (i.e. 20 MB) of memory.

    In addition, a lot of the code has been optimized and renewed, and compression options like 'max optimize lookahead' were added. Actually, I'm still thinking about a more accurate name for this parameter. It defines how many bytes LZPM must check ahead with lazy matching: zero disables the improved parsing, and higher values give higher compression at some cost in compression time (the decoder stays untouched in all modes). Higher values always run slower, but do not always compress better. Valid values are 0 to 3, and the best choice depends on each file - for some files '1' is best, for others '3', and so on.
    Some testing results:

    world95.txt
    LZPM 0.07, c0: 658,403 bytes
    LZPM 0.07, c1: 617,255 bytes
    LZPM 0.07, c2: 603,267 bytes
    LZPM 0.07, c3: 598,837 bytes

    english.dic
    LZPM 0.07, c0: 1,017,266 bytes
    LZPM 0.07, c1: 1,015,902 bytes
    LZPM 0.07, c2: 1,031,857 bytes
    LZPM 0.07, c3: 1,048,941 bytes

    rafale.bmp
    LZPM 0.07, c0: 1,129,188 bytes
    LZPM 0.07, c1: 1,102,955 bytes
    LZPM 0.07, c2: 1,094,724 bytes
    LZPM 0.07, c3: 1,093,084 bytes

    acrord32.exe
    LZPM 0.07, c0: 1,649,978 bytes
    LZPM 0.07, c1: 1,634,282 bytes
    LZPM 0.07, c2: 1,630,196 bytes
    LZPM 0.07, c3: 1,632,103 bytes

    Also I played with Flexible Parsing. In conclusion, a hash-chain-based match finder does not work so well in this case - the improvement over lazy matching is tiny, while compression speed drops greatly. A brute-force match finder may help, but then compression takes forever. In the future I will try to find a match finder that is either more efficient in memory use (17*N is okay, but at some point it gets too heavy) or more effective with flexible/optimal parsing.


  2. #2
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Will overall compression ratio be much better than previous versions?

  3. #3
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    LZPM 0.06's numbers from SFC (only these 4 files):
    world95.txt - 603 251
    english.dic - 1 031 885
    rafale.bmp - 1 094 750
    acrord32.exe - 1 630 322

    No substantial improvements, it seems.
    There are improvements, though. Thanks for the info, encode!

  4. #4
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Quote Originally Posted by LovePimple
    Will overall compression ratio be much better than previous versions?
    As Black_Fox noted, there is no killer compression improvement, since the compression scheme itself stays untouched - I just improved/optimized some aspects.

    Anyway, I'm currently thinking about how to organize the command-line interface - i.e. which switches and behavior to use.

    For example, I have an idea to add optional filters like exe and delta, so the user will be able to control the preprocessing stage. Just thoughts:
    -t0 - no transform
    -t1 - delta transformer with lag 1
    ...
    -t4 - delta transformer with lag 4
    -te - exe transformer

    This set of filters can cover a wide range of different file types. For example:

    24-bit pictures: -t3
    32-bit pictures: -t4

    8-bit mono audio: -t1
    8-bit stereo audio: -t2
    16-bit mono audio: -t2
    16-bit stereo audio: -t4

    executables: -te

    Not sure what to call 'max optimize lookahead' (that is what RKC calls it). Maybe I could call it:
    Fast bytes
    or
    Word size
    or
    Optimize level
    or
    Longest/largest optimized match
    or
    ??

    Any ideas?


  5. #5
    Bulat Ziganshin (Programmer)
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,503
    Thanks
    741
    Thanked 665 Times in 359 Posts
    just "lookahead" should be enough

  6. #6
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Extended the max lookahead to nine and added a nice heuristic to handle it. Additionally, you can now control the compression speed:
    0 - the fastest
    9 - the slowest, and best for text files

    Now LZPM usage looks like this:

    lzpm 0.07 copyright (c) 2007 ilia muraviev
    usage: lzpm command input output
    commands:
    0..9 = compress with given lookahead
    d = decompress

    If you do not agree with the "compress with given lookahead" wording, please post your own!

    Some testing results:

    ENWIK8
    LZPM 0.07, 0: 30,601,555 bytes
    LZPM 0.07, 1: 29,227,657 bytes
    LZPM 0.07, 2: 28,823,028 bytes
    LZPM 0.07, 3: 28,746,141 bytes
    LZPM 0.07, 4: 28,730,605 bytes
    LZPM 0.07, 5: 28,705,139 bytes
    LZPM 0.07, 6: 28,685,869 bytes
    LZPM 0.07, 7: 28,669,075 bytes
    LZPM 0.07, 8: 28,656,178 bytes
    LZPM 0.07, 9: 28,647,092 bytes

    ENWIK9
    LZPM 0.07, 9: 248,885,077 bytes

    world95.txt
    LZPM 0.07, 0: 656,771 bytes
    LZPM 0.07, 1: 616,120 bytes
    LZPM 0.07, 2: 602,394 bytes
    LZPM 0.07, 3: 598,496 bytes
    LZPM 0.07, 4: 598,107 bytes
    LZPM 0.07, 5: 597,512 bytes
    LZPM 0.07, 6: 597,278 bytes
    LZPM 0.07, 7: 596,923 bytes
    LZPM 0.07, 8: 596,442 bytes
    LZPM 0.07, 9: 595,939 bytes

    3200.txt
    LZPM 0.07, 0: 5,341,171 bytes
    LZPM 0.07, 1: 5,100,642 bytes
    LZPM 0.07, 2: 5,043,820 bytes
    LZPM 0.07, 3: 5,037,113 bytes
    LZPM 0.07, 4: 5,036,208 bytes
    LZPM 0.07, 5: 5,030,811 bytes
    LZPM 0.07, 6: 5,024,945 bytes
    LZPM 0.07, 7: 5,021,568 bytes
    LZPM 0.07, 8: 5,019,052 bytes
    LZPM 0.07, 9: 5,018,005 bytes


  7. #7
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Experimenting with the improved Lazy Matching, I thought to myself, "Why not just add Flexible Parsing?" - and added it!

    Some results:

    world95.txt: 587,164 bytes
    english.dic: 939,455 bytes
    rafale.bmp: 1,078,642 bytes
    acrord32.exe: 1,624,094 bytes
    fp.log: 683,985 bytes

    ENWIK8: 28,466,551 bytes
    ENWIK9: 247,100,377 bytes


  8. #8
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Added extra tricks and flicks:

    world95.txt: 585,659 bytes
    fp.log: 656,168 bytes

    ENWIK8: 28,385,939 bytes
    ENWIK9: 246,426,198 bytes

    This version finally outperforms QUAD in most cases - and that is despite QUAD's order-2-0 PPM. So this new LZPM has a quite powerful LZ layer and an extremely simple decoder - just order-1 arithmetic coding and the simplest LZ-related machinery. Thanks to that, LZPM is notably faster than QUAD at decompression (and needs less memory for it), and now LZPM is even stronger...

    However, LZPM is slower at compression and uses more memory while compressing. Anyway, the goal is FAST decompression with the LOWEST memory usage (in modern terms).

    And the main thing - as with all my projects, I keep everything as simple as possible. The complete source of LZPM (including the encoder, decoder and command-line interface) is ~8 KB long! Compare that with LZMA... In other words, this is the king of my compressors!


  9. #9
    Member
    Join Date
    Jun 2008
    Location
    G
    Posts
    372
    Thanks
    26
    Thanked 22 Times in 15 Posts
    very good work encode

  10. #10
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Thanks!

    The release is soon - I think within 1 to 3 weeks - it just needs more testing. Who knows, maybe I will add something extra. I also have an idea to create a small homepage at encode.su.

    And finally, some day I will release it at SourceForge.net as the follower of QUAD!

  11. #11
    Member
    Join Date
    Jun 2008
    Location
    G
    Posts
    372
    Thanks
    26
    Thanked 22 Times in 15 Posts
    Cool! Maybe later I can use this "small code" as my entry into the compression world. =)

  12. #12
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Finally, I compared the performance of the QUAD and the LZPM on my new Core 2 Duo.

    Surprisingly, QUAD decompresses a little bit faster than LZPM, thanks to the large CPU cache - I think even QUAD's whole PPM model can fit into it!

    However, I made a few dual-core-related optimizations in LZPM, so now LZPM decodes at approximately the same speed on this CPU.

    LZPM uses a bit-oriented encoder while QUAD uses a byte-oriented one - this can make a difference. Also, QUAD's PPM fits modern CPUs ideally (the linear frequency search runs faster, etc.). Anyway, on processors with a smaller cache, including the Pentium 4, LZPM's approach runs faster.

    The compression speed of the new LZPM with optimized parsing is about half that of QUAD with the "-x" option. On text files LZPM may even be 3 to 4 times slower than QUAD (a property of hash chains).

    Concluding, the new LZPM already features:
    + Tuning for the modern Intel Core 2 Duo processors
    + No file size limits, tested on HUGE files (yesterday I compressed a 50 GB file set and verified the decompression!)
    + Improved compression


  13. #13
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by encode
    Surprisingly, QUAD decompresses a little bit faster than LZPM! Thanks to a large CPU cache. I think that even QUAD's whole PPM model can fit into its cache!
    It will be interesting to test performance on my Sempron 2400+ machine with its tiny (64K+64K+256K) CPU cache.

  14. #14
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    To be honest, the decoder stayed untouched. I only changed the "uncompressed size" field from a 32-bit integer to 64-bit, for huge file support. So if LZPM 0.06 decompresses faster than QUAD on your PC, LZPM 0.07 must as well.

    There is also a difference in compiler/libraries:
    LZPM 0.06 was compiled with Visual C++ 6.0
    LZPM 0.07 was compiled with Visual C++ 2005 SP1 (the latest)

    VC 2005 provides handy functions for working with huge files (64-bit sizes). I also believe the CRT from the latest VC is more secure and safe.

    In addition, I compared the brand-new LZPM 0.07 with the original ROLZ2 (mcomp -mf). Version by version, LZPM closes the performance gap. Some stats:
    LZPM compresses two times faster than ROLZ2 in normal mode
    LZPM has slightly higher decompression speed
    LZPM's compression is pretty close to ROLZ2's fast mode
    LZPM is much more stable on already-compressed data like A10.jpg, even compared to ROLZ3


  15. #15
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    It was my mistake! Of course LZPM is faster than QUAD in all cases!!!

    The catch is in the new VC 2005!

    I finally found out why VC 2005 builds run slower: VC 2005 ships only a multi-threaded CRT. So if you call something like fgetc() or fputc(), each call performs thread locking to keep other threads away from internal CRT structures - and as a result, performance degrades.

    So, under VC 2005 use:

    _fgetc_nolock()
    _fputc_nolock()
    _fread_nolock()
    _fwrite_nolock()
    etc.

    These functions are meant for single-threaded apps like my console compressor, and they work FASTER!!!


  16. #16
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Can't wait!

  17. #17
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    By the way, here is also a gentler solution:

    #define _CRT_DISABLE_PERFCRIT_LOCKS

    Remarks (from MSDN):

    Defining _CRT_DISABLE_PERFCRIT_LOCKS forces all I/O operations to assume a single-threaded I/O model and use the _nolock forms of the functions. This lets highly I/O-bound single-threaded programs get better performance. For more information, see Multithreaded Libraries Performance.

  18. #18
    Bulat Ziganshin (Programmer)
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,503
    Thanks
    741
    Thanked 665 Times in 359 Posts
    Quote Originally Posted by LovePimple
    It will be interesting to test performance on my Sempron 2400+ machine with its tiny (64K+64K+256K) CPU cache.
    if it's tiny, what would you call my Duron's caches?

  19. #19
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Bulat Ziganshin
    if it's tiny, what would you call my Duron's caches?
    I didn't know you had a Duron processor. I thought you had a 1 GHz Intel Pentium 4 processor.
