
Thread: LZPM's future

  1. #91
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    yep, that would be good, but branching will decrease speed. if you're aiming for the highest decompression speed then the current scheme should be better.

    if speed is not an issue and you can afford a decompression speed only about two times higher than lzx, then you can go to gamma encoding.

    gamma encoding allows unlimited match lengths and unlimited offsets. with e.g. a 16 mbyte sliding window you can beat lzx in compression ratio and decompression speed.
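
    for reference, a minimal sketch of an interleaved (JCALG-style) gamma encoder - the putbit() helper and __builtin_clz are assumptions, not code from any actual coder:
    Code:

    /* assumed bit-output helper, MSB-first */
    void putbit(int b);

    /* interleaved gamma code: the decoder is
         c = 1; while (getbit()) c = 2*c + getbit();
       so for v >= 1 the encoder emits each payload bit of v (below the
       leading 1) prefixed by a '1' flag, then a final '0'. */
    void gamma_encode(unsigned v)            /* v >= 1 */
    {
        int top = 31 - __builtin_clz(v);     /* index of the leading 1 bit (GCC builtin) */
        for (int i = top - 1; i >= 0; --i) {
            putbit(1);                       /* "more bits follow" flag */
            putbit((v >> i) & 1);            /* next payload bit */
        }
        putbit(0);                           /* terminator */
    }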

  2. #92
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Quote Originally Posted by donkey7
    if speed is not an issue and you can afford decompression speed only about two times higher than lzx then you can go to gamma encoding.
    Currently the measure is: given compression improvement vs. decoder complexity. If the compression gain is worth the additional complexity, it's OK. But if we're getting only a small gain at the cost of even a negligible increase in decoder complexity, it's not worth it. Right now I'm starting to test the current modifications; results will be posted ASAP...

  3. #93
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    OK, results:

    The first variant has a fixed offset length. MINMATCH is four. MAXMATCH is halved each time the dictionary size doubles; nothing special. LZ codes fit in three bytes.

    acrord32.exe
    64K - 2,198,834 bytes
    128K - 2,147,925 bytes
    256K - 2,106,636 bytes
    512K - 2,081,122 bytes

    world95.txt
    64K - 968,257 bytes
    128K - 868,034 bytes
    256K - 797,370 bytes
    512K - 774,214 bytes

    calgary.tar
    64K - 1,247,541 bytes
    128K - 1,182,915 bytes
    256K - 1,144,049 bytes
    512K - 1,133,915 bytes

    The second variant has a variable-length offset. We keep an extra bit to indicate whether to read a 16-bit or an 8-bit offset. This bit is kept in the match length field, which halves MAXMATCH; in addition, the extra match offset bits are also stored in the first byte of an LZ code, which halves MAXMATCH again. MINMATCH is equal to three, which slows down the compression further. The cool part is that during optimal parsing we see exactly how many bytes are needed to code each choice and pick the optimal one.

    acrord32.exe
    64K - 2,019,001 bytes
    128K - 1,957,766 bytes
    256K - 1,912,172 bytes

    world95.txt
    64K - 932,598 bytes
    128K - 840,694 bytes
    256K - 802,700 bytes

    calgary.tar
    64K - 1,195,779 bytes
    128K - 1,127,058 bytes
    256K - 1,089,227 bytes

    Currently I'm for the 128K variable offset scheme, since in addition the decompressor is still very simple - roughly like the sketch below.
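
    Just to illustrate, reading one LZ code of the 128K variable-offset variant could look something like this - the exact bit layout (flag in bit 7, extra offset bit in bit 6, 6-bit length) is only an assumed arrangement, not the final format:
    Code:

    #include <stdint.h>

    /* assumed layout of one LZ code:
       byte 0: bit 7 = long/short offset flag, bit 6 = offset bit 16,
               bits 5..0 = match length - MINMATCH (MINMATCH == 3) */
    typedef struct { uint32_t len, dist; } LzCode;

    static LzCode read_code(const uint8_t **pp)
    {
        const uint8_t *p = *pp;
        uint8_t b0 = *p++;
        LzCode  c;
        c.len = (b0 & 0x3F) + 3;
        if (b0 & 0x80) {                                  /* long: 16 stored offset bits */
            c.dist  = (uint32_t)p[0] | ((uint32_t)p[1] << 8);
            c.dist |= (uint32_t)((b0 >> 6) & 1) << 16;    /* extra bit -> 128K window */
            p += 2;
        } else {                                          /* short: 8 stored offset bits */
            c.dist = *p++;
        }
        *pp = p;
        return c;
    }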

  4. #94
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Results for fixed offset length LZSS, 256K window:

    ENWIK8: 38,424,599 bytes
    ENWIK9: 339,113,129 bytes

    LZSS completely outperforms LZW...

  5. #95
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Some SFX executables to play with. LZSS with a 64 KB window and variable offset length (bytewise) was used, since it's better for executables.

    acrord32.exe
    test.exe
    mptrack.exe
    hexedit.exe


  6. #96
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    I will try gamma codes. I just looked at JCALG and saw how easy it is to decode:
    Code:

    int c = 1;                  /* start from the implicit leading 1 bit */
    while (getbit())
        c += c + getbit();      /* i.e. c = 2*c + next bit */

    i.e. we mix control bits and the actual bits of the value.

    I will try the following scheme (a rough decoder sketch is below):

    0/1 - literal/match

    a literal is stored unencoded (8 bits)

    the match length is gamma encoded

    the low 8 bits of an offset are stored as-is (8 bits)

    the higher bits of an offset are gamma encoded. If we use a 64K dictionary, we gamma encode the high byte.
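
    Something like this, as a rough sketch - the getbit()/getbyte() helpers and the +/-1 biases are assumptions, not the final format:
    Code:

    /* assumed input helpers */
    unsigned getbit(void);
    unsigned getbyte(void);                   /* next 8 bits */

    static unsigned gamma_decode(void)        /* interleaved gamma, returns a value >= 1 */
    {
        unsigned c = 1;
        while (getbit())
            c = 2 * c + getbit();
        return c;
    }

    /* decode one token: returns 1 for a match (fills *len and *offset),
       0 for a literal (fills *lit) */
    static int decode_token(unsigned *lit, unsigned *len, unsigned *offset)
    {
        if (!getbit()) {                      /* 0 -> literal, stored unencoded */
            *lit = getbyte();
            return 0;
        }
        *len = gamma_decode();                /* gamma-coded match length */
        unsigned low  = getbyte();            /* low 8 offset bits, verbatim */
        unsigned high = gamma_decode() - 1;   /* gamma-coded high offset bits (the -1 bias is an assumption) */
        *offset = (high << 8) | low;
        return 1;
    }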


  7. #97
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,008
    Thanks
    301
    Thanked 1,324 Times in 757 Posts
    Why not just use blockwise static huffman codes?

  8. #98
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    btw:
    storing the low 8 bits unencoded and gamma-encoding the high bits is equivalent to exponential Golomb coding of order 8
    http://en.wikipedia.org/wiki/Exponential-Golomb_coding
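
    i.e. roughly this (a sketch only; gamma_encode() and putbit() are assumed helpers, and an interleaved gamma for the high part gives the same code lengths as the classic prefix form):
    Code:

    /* order-8 exponential Golomb: gamma-code the high part, then 8 raw low bits */
    void exp_golomb8_encode(unsigned v)
    {
        gamma_encode((v >> 8) + 1);          /* +1 so the gamma code gets a value >= 1 */
        for (int i = 7; i >= 0; --i)
            putbit((v >> i) & 1);            /* low 8 bits stored verbatim */
    }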

  9. #99
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Quote Originally Posted by Shelwien
    Why not just use blockwise static huffman codes?
    Huffman is more complex to decode.

    Simultaneously, I'm working on an LZ77-based LZPM - trying to adapt the current S&S parsing to cope with adaptive encodings. Generally speaking, the current LZ77 is notably weaker than my ROLZ - however, it already has better binary compression, faster decompression and a smaller memory footprint. Although with S&S parsing it is dead slow. Another variant is to use lazy matching. In that case LZPM will have fast enough compression and very fast decompression - like a heavily armed Deflate.

  10. #100
    Member
    Join Date
    Feb 2008
    Posts
    31
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    Huffman is more complex to decode.
    Huffman decoding is nearly the fastest among all bit-aligned codes. The decoding procedure can be reduced to fewer than 2 bit-aligned reads on average.

  11. #101
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    Quote Originally Posted by Dmitry Shkarin
    Huffman decoding is nearly the fastest among all bit-aligned codes. The decoding procedure can be reduced to fewer than 2 bit-aligned reads on average.
    how? using an automaton? what about the size of the auxiliary structures?

  12. #102
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    By the way, nowadays we have fast enough arithmetic coding - FPAQ0P-like, for example - which gives better compression at the cost of some extra time, which is negligible on modern PCs.
    IMHO, Huffman - the thing is out...

  13. #103
    Member
    Join Date
    Feb 2008
    Posts
    31
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by donkey7
    how?
    Suppose we have calculated the lengths of the Huffman codes and the minimal code length is N. Then we can read N bits and stop for the most probable symbols, and repeat the reading for the other symbols. The decoding procedure will be something similar to:

    extern UINT GetBits(UINT NBits);

    struct HUFF_UNPACK_ITEM {
        INT_TYPE NextTable, BitsToRead;        /* BitsToRead == 0 marks a leaf */
    } Table[2*ALPHABET_SIZE];

    inline UINT DecodeSymbol()
    {
        const HUFF_UNPACK_ITEM* ptr = Table;
        do {                                   /* walk sub-tables until a leaf is reached */
            ptr += ptr->NextTable + GetBits(ptr->BitsToRead);
        } while (ptr->BitsToRead != 0);
        return ptr->NextTable;                 /* at a leaf, NextTable holds the symbol */
    }

    Quote Originally Posted by donkey7
    using automaton?
    What is it?

    Quote Originally Posted by donkey7
    what about the size of auxiliary structures?
    4 bytes per tree node, 2 tree nodes per symbol of the alphabet, for alphabet sizes < 2^16.

    AFAIR, I have described it, with source code, in ru.compress.

  14. #104
    Member
    Join Date
    Feb 2008
    Posts
    31
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    IMHO, Huffman - the thing is out...
    No, of course not.
    1. For fast online coding, we need something similar to Golomb-Rice coding.
    2. For fast and more precise offline coding, we need Huffman coding.
    3. For precise online coding, we need arithmetic coding.

  15. #105
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,008
    Thanks
    301
    Thanked 1,324 Times in 757 Posts
    > By the way, nowadays, we have a fast enough arithmetic compression - FPAQ0P-like
    > for example.

    Hmm... I finally tested my "aridemo" coder from 2001, which is based on a full range coder (not binary)
    and Shkarin's unary model, and, kinda surprisingly, it really does appear to be somewhat
    slower than fpaq0p - especially in decoding.
    Well, it seems that the worse speed is caused by counter rescalings (Shkarin's models mostly
    use this method for nonstationary sources - counters are divided by 2 when some threshold is reached),
    and the worse compression can be explained by that too - modern counters are clearly superior to a simple
    increment with occasional rescaling... and the worse compression causes additional slowdown too.
    Wonder if I should really write a new optimized order-0 to support my statements.
    Well, that kind of difference surely can be covered by proper optimization...

    http://shelwien.googlepages.com/o0c.htm
    http://shelwien.googlepages.com/o0c.rar

    > Which has a greater performance at the cost of some time, which is negligible on modern PCs.
    > IMHO, Huffman - the thing is out...

    Alas, there's no good blockwise Huffman implementation to test right away, but it should still
    be significantly faster than a coder with 8 multiplications, 16 shifts and other things per
    byte - imho the comparison with the "memcpy model" clearly shows that.
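
    For contrast, the two counter styles in question - occasional rescaling vs. an fpaq0p-style shift update -
    look roughly like this (a sketch only; the threshold and shift amounts are illustrative, not taken from either coder):
    Code:

    /* frequency counter with occasional rescaling */
    typedef struct { unsigned n0, n1; } FreqCounter;

    static void freq_update(FreqCounter *c, int bit)
    {
        if (bit) c->n1++; else c->n0++;
        if (c->n0 + c->n1 > 65536) {           /* threshold reached: */
            c->n0 = (c->n0 + 1) >> 1;          /* divide both counts by 2 */
            c->n1 = (c->n1 + 1) >> 1;
        }
    }

    /* fpaq0p-style shift counter: P(bit==1) scaled to 0..4095 */
    typedef struct { unsigned p1; } ShiftCounter;

    static void shift_update(ShiftCounter *c, int bit)
    {
        if (bit) c->p1 += (4096 - c->p1) >> 5; /* adapt toward 1 */
        else     c->p1 -=  c->p1         >> 5; /* adapt toward 0 */
    }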

  16. #106
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    decoding of a number coded with an exponential Golomb code of order n, with a maximum value of 8 + n bits:

    1. get 8 bits from the bitstream (peek only - don't skip, i.e. don't move the pointer in the bitstream!!!).
    2. do a bsr on the returned value (one assembly instruction!!!) to determine the length of the number. bsr gives us the value of k - n, where k is the number of bits in the coded value.
    3. skip the k - n bits just analysed.
    4. read (get and skip) a k - n bit value.
    5. read n bits and append them to the value.
    6. that's all!!! (rough C version below)
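
    something like this in C (a sketch under assumptions: peek32()/skip_bits()/read_bits() are hypothetical bit-reader helpers, and __builtin_clz plays the role of bsr):
    Code:

    #include <stdint.h>

    uint32_t peek32(void);             /* next 32 bits, MSB-first, without consuming them */
    void     skip_bits(unsigned n);    /* consume n bits */
    uint32_t read_bits(unsigned n);    /* consume and return n bits */

    /* decode one exponential Golomb code of order k (k low bits stored verbatim) */
    static uint32_t exp_golomb_decode(unsigned k)
    {
        uint32_t w  = peek32();
        unsigned lz = __builtin_clz(w | 1);    /* length of the zero prefix */
        skip_bits(lz);                         /* drop the prefix */
        uint32_t u  = read_bits(lz + 1);       /* the leading 1 bit plus lz payload bits */
        uint32_t lo = k ? read_bits(k) : 0;    /* the k verbatim low bits */
        return ((u - 1) << k) | lo;
    }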

  17. #107
    Member
    Join Date
    Feb 2008
    Posts
    31
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by donkey7
    1. get 8 bits from the bitstream (peek only - don't skip, i.e. don't move the pointer in the bitstream!!!).
    2. do a bsr on the returned value (one assembly instruction!!!) to determine the length of the number. bsr gives us the value of k - n, where k is the number of bits in the coded value.
    3. skip the k - n bits just analysed.
    4. read (get and skip) a k - n bit value.
    5. read n bits and append them to the value.
    6. that's all!!!
    Very complex, compare it with my code sample.

  18. #108
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Quote Originally Posted by donkey7
    decoding of number coded by exponential golomb codes order n with maximum 8 + n bits value.
    Not quite right. Look at the JCALG source and at what I've posted above. What we do is just read bit by bit:

    1 X 1 X 1 X 0

    where X is a bit of the integer and a zero marks the end of the value.

  19. #109
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    simpler version:

    1. get 8 bits
    2. do a bsr
    3. read 2 * (k - n) bits to result
    4. read n bits and append to result

    will post an assembly version soon

    ilia:
    reading bit by bit is very expensive

  20. #110
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Quote Originally Posted by donkey7
    reading bit by bit is very expensive
    add eax,eax    ; shift the bit buffer left; the top bit goes into the carry flag
    jc got_one
    That's all!

  21. #111
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    encode:
    as you can see, it creates branches.

    my method (assume that ebp holds the stream pointer * 8, i.e. the position in memory in bits):

    mov ebx,ebp
    shr ebx,3           ; byte address of the current bit position
    mov eax,[ebx]
    bswap eax           ; load 32 bits MSB-first
    mov ecx,ebp
    and ecx,7
    shl eax,cl          ; align the code with the top of eax
    bsr ecx,eax         ; index of the leading 1 bit
    neg ecx
    add ecx,32
    add ecx,ecx
    add ecx,-1
    add ecx,n           ; ecx holds the size of the encoded value
    add ebp,ecx         ; advance the bit position
    neg ecx
    add ecx,32
    shr eax,cl
    add eax,-(1 << n)   ; result in eax

    heavily unoptimized; the encoded value must be <= 25 bits in size (encoded size).

  22. #112
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    slightly optimized version:

    mov ebx,ebp
    shr ebx,3
    mov eax,[ebx]
    bswap eax
    mov ecx,ebp
    and ecx,7
    shl eax,cl
    bsr edx,eax
    lea ecx,[edx*2+31-n]
    shr eax,cl
    add eax,-(1 << n)
    lea edx,[edx*2-n-63]
    sub ebp,edx


    with n = 8 we can encode a maximum value of 261887, i.e. a 256 KB window. with n = 6 the window can be 512 KB in size.

  23. #113
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Well, I'm still working on parsing, but check out some results in comparison with LZPM v0.15 with the 9 switch:

    pak0.pak (183,997,730 bytes)
    LZPM v0.15: 88,226,941 bytes
    LZPM v0.16: 83,000,690 bytes

    MPTRACK.EXE (1,159,172 bytes)
    LZPM v0.15: 499,254 bytes
    LZPM v0.16: 489,102 bytes

    acrord32.exe (3,870,784 bytes)
    LZPM v0.15: 1,438,130 bytes
    LZPM v0.16: 1,412,815 bytes

    world95.txt (2,988,578 bytes)
    LZPM v0.15: 570,181 bytes
    LZPM v0.16: 611,035 bytes


  24. #114
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    lzpm 0.15 == rolz
    lzpm 0.16 == lz77

    ?

  25. #115
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Yep.

  26. #116
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    The question: maybe I should keep LZPM as is and release a new, separate compression program? Also, about large dictionaries - should I go there? With a larger dictionary compression is slower, but often better, especially on text files. Preferable sizes are from 64K to 1024K. What dictionary size do you prefer? Having said that, my LZ77 is already slow, so making it very slow with a large dictionary is OK. Anyway, the decompression is pretty fast.

  27. #117
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,498
    Thanks
    26
    Thanked 135 Times in 103 Posts
    i suggest an LTCB edition

    i.e. ROLZ with 2 context bytes (or even 3 bytes - or maybe introduce some switching, i.e. keep both ROLZ order-2 and ROLZ order-3 models and write a flag to tell which model is used). design some suffix trees to speed up parsing, but add patterns at every byte (not only on match boundaries). plus order-1 and order-2 mixed coders for literals.

    that would be a very slow LZ coder but it's interesting how far it could be pushed on text compression.

  28. #118
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    I don't like the idea of compressors specially made for a given file. General-purpose compression is the goal. For text files, PPM/CM-based compressors are better in any case. LZ77 has advantages on binary data like EXEs, sets of different files etc., while at the same time giving the fastest decompression.

    Quote Originally Posted by donkey7
    that would be a very slow LZ coder but it's interesting how far it could be pushed on text compression.
    I don't think that this scheme should be slow. An order-2 literal coder can be fast. Mixed-order offset sets for ROLZ can only speed up the search - since we limit the number of offsets available. Basically, with well-designed LZ coders with optimized parsing, 99.9% of the time is spent on string searching. For example, with my new LZ77, string searching takes nearly all of the compression time - determining the shortest path and the actual arithmetic coding take no time...

  29. #119
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Black_Fox, don't keep silent - tell us what you think about the future of LZPM!

  30. #120
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,020
    Thanks
    407
    Thanked 410 Times in 157 Posts
    Hm, well, I just made a new LZ77-based file compressor. The release is coming soon - within one or two weeks. I think this one will be the flagship of all my compressors!


