
Thread: LZW, LZMW and LZAP comparison

  1. #1
    encode (The Founder)
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts

    LZW, LZMW and LZAP comparison

    After testing LZW with a huge dictionary, I decided to additionally test some LZW modifications: LZMW and LZAP. The implementations are identical except for dictionary handling, so we can see the real difference in performance on various data types. LZ-codes (dictionary pointers) are coded using flat codes, and all files are coded with no dictionary reset.
    book1
    LZW: 314,543 bytes
    LZMW: 317,227 bytes
    LZAP: 306,942 bytes

    calgary.tar
    LZW: 1,307,956 bytes
    LZMW: 1,243,481 bytes
    LZAP: 1,221,904 bytes

    world95.txt
    LZW: 947,113 bytes
    LZMW: 764,013 bytes
    LZAP: 751,613 bytes

    fp.log
    LZW: 2,009,356 bytes
    LZMW: 1,074,090 bytes
    LZAP: 1,147,196 bytes

    english.dic
    LZW: 1,607,939 bytes
    LZMW: 1,815,974 bytes
    LZAP: 1,646,121 bytes

    A10.jpg
    LZW: 990,614 bytes
    LZMW: 1,081,968 bytes
    LZAP: 1,030,322 bytes

    3200.txt
    LZW: 5,642,451 bytes
    LZMW: 5,607,396 bytes
    LZAP: 5,475,965 bytes

    pht.psd
    LZW: 4,880,558 bytes
    LZMW: 3,240,734 bytes
    LZAP: 2,931,192 bytes
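For readers unfamiliar with the three variants, the phrase-addition rules that distinguish them can be sketched roughly as follows. This is an illustration only, not the code used for the results above; `dictionary.add` is a stand-in for the real hash or trie insert, and code emission and dictionary limits are omitted:

```python
def lzw_update(dictionary, prev_phrase, next_symbol):
    # LZW: add the previous match extended by one following symbol.
    dictionary.add(prev_phrase + next_symbol)

def lzmw_update(dictionary, prev_phrase, curr_phrase):
    # LZMW: add the concatenation of the two most recent matches.
    dictionary.add(prev_phrase + curr_phrase)

def lzap_update(dictionary, prev_phrase, curr_phrase):
    # LZAP: add prev_phrase joined with All Prefixes of the current match,
    # which grows the dictionary much faster than LZW or LZMW.
    for i in range(1, len(curr_phrase) + 1):
        dictionary.add(prev_phrase + curr_phrase[:i])
```

The faster-growing LZAP dictionary is one plausible reason it wins on most of the files above.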

  2. #2
    encode
    Anyway, LZ77 is supreme, especially LZ77/LZSS with S&S or even Optimal Parsing. Even a simple scheme with byte-aligned output (no bit I/O, Huffman or arithmetic coding) can easily beat LZW, even on text files...
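A byte-aligned LZSS output scheme of the kind mentioned here is often done with one flag byte per group of eight tokens. The following is a hypothetical sketch, not encode's actual format; the 12-bit offset / 4-bit length split is an assumption chosen for illustration:

```python
def lzss_emit(tokens):
    """Byte-aligned LZSS output sketch: one flag byte per up to 8 tokens,
    literals as raw bytes, matches as 2-byte (offset, length) pairs.
    Offsets are limited to 12 bits and lengths to 4 bits here."""
    out = bytearray()
    for group_start in range(0, len(tokens), 8):
        group = tokens[group_start:group_start + 8]
        flags = 0
        body = bytearray()
        for i, tok in enumerate(group):
            if isinstance(tok, int):          # literal byte
                body.append(tok)
            else:                             # (offset, length) match
                flags |= 1 << i               # set bit i for a match
                off, length = tok
                body.append(off >> 4)
                body.append(((off & 0xF) << 4) | (length & 0xF))
        out.append(flags)
        out += body
    return bytes(out)
```

No bit I/O is needed at all, which is what makes such schemes so fast to decode.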

  3. #3
    Member
    Join Date
    May 2008
    Location
    brazil
    Posts
    163
    Thanks
    0
    Thanked 3 Times in 3 Posts
    Quote Originally Posted by encode View Post
    Anyway, LZ77 is supreme, especially LZ77/LZSS with S&S or even Optimal Parsing. Even a simple scheme with byte-aligned output (no bit I/O, Huffman or arithmetic coding) can easily beat LZW, even on text files...

    Is it possible to use some LZT techniques to test a new model of LZ algorithms?

    http://compgt.googlepages.com/lz77

  4. #4
    encode
    Quote Originally Posted by Modeling for Text Compression
    3.4.9 LZT
    LZT [Tischer 1987] is based on LZC. The main difference is that once the dictionary is full, space is made for new phrases by discarding the least recently used phrase (LRU replacement). This is performed efficiently by maintaining phrases in a self-organizing list indexed by a hash table. The list is designed so that a phrase can be replaced in a small, bounded number of pointer operations. Because of the extra housekeeping, this algorithm is a little slower than LZC, but the more sophisticated choice of phrases in the dictionary means that it can achieve the same compression ratio with less memory. LZT also codes phrase numbers slightly more efficiently than LZC by using a slightly better method of phasing the binary encoding. (This improvement could also be applied to several other LZ algorithms.) A little extra effort is required of the encoder and decoder, but it is insignificant compared with the task of searching and maintaining the LRU list. The second variation of Miller and Wegman [1984] is an independent discovery of LZT.
    As you can see, in my implementations listed above not even a dictionary reset is performed. Well, theoretically it is possible to implement LZW/LZMW/LZAP with such an LRU dictionary...
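A minimal sketch of such an LRU phrase dictionary, with `collections.OrderedDict` standing in for the self-organizing list plus hash table described in the quote. This is illustrative only; a real LZT would also reuse the evicted phrase's code number rather than issuing fresh ones:

```python
from collections import OrderedDict

class LRUDict:
    """Phrase dictionary with LRU replacement, as in LZT."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.phrases = OrderedDict()   # phrase -> code number, oldest first
        self.next_code = 0

    def lookup(self, phrase):
        code = self.phrases.get(phrase)
        if code is not None:
            self.phrases.move_to_end(phrase)  # mark as most recently used
        return code

    def add(self, phrase):
        if len(self.phrases) >= self.capacity:
            # Dictionary full: discard the least recently used phrase.
            self.phrases.popitem(last=False)
        self.phrases[phrase] = self.next_code
        self.next_code += 1
```

Both `move_to_end` and `popitem` run in a bounded number of pointer operations, matching the property the quote emphasizes.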

  5. #5
    encode
    Phasing the binary encoding, i.e. phased-in codes, may be applied, although I have never tested that trick...
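Phased-in codes save bits whenever the number of possible codes n is not a power of two: some codewords get floor(log2 n) bits instead of a flat ceil(log2 n). A rough sketch of the standard construction:

```python
def phased_in_code(value, n):
    """Phased-in code for one of n equally likely values (0 <= value < n).
    The first m = 2**(k+1) - n values get k bits, the rest k+1 bits,
    where k = floor(log2 n). Returns the codeword as a bit string."""
    k = n.bit_length() - 1        # floor(log2 n)
    m = (1 << (k + 1)) - n        # number of short (k-bit) codewords
    if value < m:
        return format(value, f'0{k}b')
    return format(value + m, f'0{k + 1}b')
```

For n = 6 this produces 00, 01, 100, 101, 110, 111, i.e. 2.67 bits/symbol on average instead of a flat 3; when n is a power of two it degenerates to the ordinary flat code.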

  6. #6
    encode
    Some paper for interested audience:
    Attached Files

  7. #7
    biject.bwts (Member)
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Just to show the difference, I modified Mark Nelson's old LZW.C to include a larger dictionary and then used the two-level codes just to see the difference. It was fast and dirty, but:
    FROM TABLE ABOVE
    book1
    LZW: 314,543 bytes
    LZMW: 317,227 bytes
    LZAP: 306,942 bytes

    calgary.tar
    LZW: 1,307,956 bytes
    LZMW: 1,243,481 bytes
    LZAP: 1,221,904 bytes

    USING MODIFIED MARK'S CODE WITH 2-LEVEL HUFFMAN
    book1
    LZW: 306,813 bytes

    calgary.tar
    LZW: 1,266,015 bytes
    So you can see the difference from the two-level Huffman; when the number of codes is not a power of 2, it's not very much. I might do further mods to make it bijective for 256 roots, I might not. I might first try to make it bijective for 4 roots so that it could work with DNA files. However, I'm not sure I can handle a large dictionary.
    Does anyone have short DNA files using ACGT, and data comparing the various compressors on them? I think bijective LZW would squeeze something out of them.

  8. #8
    encode
    Can you describe your two level codes?

    BTW, check out some results for LZW with my special bit-wise arithmetic encoder:
    Code:
    book1 -> 304,849 bytes
    
    world95.txt -> 897,756 bytes
    
    3200.txt -> 5,494,101 bytes
    
    ENWIK8 -> 32,601,555 bytes

  9. #9
    biject.bwts
    Quote Originally Posted by encode View Post
    Can you describe your two level codes?

    BTW, check out some results for LZW with my special bit-wise arithmetic encoder:
    Code:
    book1 -> 304,849 bytes
    
    world95.txt -> 897,756 bytes
    
    3200.txt -> 5,494,101 bytes
    
    ENWIK8 -> 32,601,555 bytes
    Well, it can't compare with arithmetic, as your book1 result shows.
    But I think it's almost as fast as writing your one-level code.
    Here is how it works. Suppose you need to code one of 6 values. You could code flat, meaning:

    1 000
    2 001
    3 010
    4 011
    5 100
    6 101
    7 110 not allowed
    8 111 not allowed

    gives you 3 bits/symbol

    2-level Huffman:

    1 00
    2 01
    3 100
    4 101
    5 110
    6 111

    gives you 2.666666... bits/symbol

    While arithmetic gives roughly

    log(6) / log(2) = 2.5849625 bits/symbol

    So that is what I mean by two-level Huffman. It's easy to write out, but on read you read one bit less than necessary. In the case above, note you read 2 bits; you have 00 01 10 11, then depending on how you slice it you decide if you need to read 1 more bit. In the case above, I would check if the value is greater than 01; if it is, read the next bit.

    Every time I code it I do it differently, but if you look at my web page

    http://bijective.dogma.net/compress2vh.htm

    and the code for Vh1, you can see how I handled the new symbol after the escape. I guess Vitter used the full 8 bits; I used a two-level Huffman based on which symbol was new out of how many are left. The code there is only slightly different from Karl's code of the Vitter Huffman, but it is fully bijective, and many people seem to use it. Note my LZW for 256 roots is not bijective; it's a copy of Mark's, where I use the two-level Huffman to write out codes instead of a one-level code. I think the gain in going to 2-level makes a good cheap improvement, but if you want more, of course arithmetic is usually the way to go.
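The decoding step described above (read two bits, then conditionally one more) can be sketched as follows. This is a hypothetical illustration of the 6-value table from the post, with `read_bit` an assumed callable returning one bit at a time:

```python
def read_two_level(read_bit):
    """Decode one of 6 values using the 2-level code from the post:
    00, 01 -> values 1, 2;  100..111 -> values 3..6.
    read_bit is a callable returning 0 or 1 per call."""
    hi = (read_bit() << 1) | read_bit()   # always read two bits first
    if hi < 2:
        return hi + 1                     # '00' -> 1, '01' -> 2
    # Value is greater than '01', so one more bit is needed.
    return ((hi << 1) | read_bit()) - 1   # '100'..'111' -> 3..6
```

Averaged over the six codewords this costs (2*2 + 4*3)/6 = 2.67 bits/symbol, matching the figure above.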

  10. #10
    encode
    OK, thanks for info!

  11. #11
    biject.bwts
    Quote Originally Posted by encode View Post

    ENWIK8:
    LZW: 34,423,356 bytes
    ZIP: 36,422,375 bytes
    COMPRESS: 46,247,947 bytes


    BTW, check out some results for LZW with my special bit-wise arithmetic encoder:
    Code:
    ENWIK8 -> 32,601,555 bytes
    I got a 24-bit 2-level plain Huffman LZW to run on my PC, compressing then decompressing and comparing the start file to the decompressed file, for the file ENWIK8:

    24 bits 33,734,145
    23 bits 34,048,092
    21 bits 36,310,503

    That's as large as I can run, but I think the 24-bit table did not fill up with enwik8. Actually, for the kind of stuff I write I don't care about speed, but these were quite fast compared to other things I have tested with this file.

  12. #12
    Member
    Join Date
    Jul 2006
    Location
    US
    Posts
    39
    Thanks
    26
    Thanked 1 Time in 1 Post
    Quote Originally Posted by encode View Post
    After testing LZW with a huge dictionary I decided to additionally test some LZW modifications: LZMW and LZAP.
    Is this LZAP by Mike Goldman? If so, which version were you running?

  13. #13
    encode
    Quote Originally Posted by spark View Post
    Is this LZAP by Mike Goldman? If so, which version were you running?
    Nope. This is my own implementation of LZAP compression algorithm.

  14. #14
    Member
    Join Date
    Jun 2017
    Location
    Bangladesh
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    How many versions of LZAP are available? Can you give some link to the latest LZAP?

  15. #15
    nikkho (Member)
    Join Date
    Jul 2011
    Location
    Spain
    Posts
    546
    Thanks
    219
    Thanked 164 Times in 105 Posts
    Quote Originally Posted by imransuet View Post
    How many versions of LZAP available ? Can You give some link about latest LZAP ?
    Please, do not duplicate posts: https://encode.su/threads/2799-What-...3561#post53561

