Results 1 to 4 of 4

Thread: lzpm 0.03 overview

  1. #1
    encode (The Founder)
    Join Date: May 2006
    Location: Moscow, Russia
    Posts: 3,982
    Thanks: 377
    Thanked 351 Times in 139 Posts
    Okay! Currently I'm playing with hashing and binary trees.

    Check out some test results (DEBUG build of lzpm 0.03):

    Layout:
    <hash size, bits>-<number of hash bins>: <compressed size> <time>

    TraktorDJStudio3.exe (29,124,024 bytes)
    18-32: 5,950,354 bytes, 8.844 sec
    17-64: 5,936,534 bytes, 13.047 sec
    16-128: 5,929,646 bytes, 20.094 sec

    world95.txt (2,988,578 bytes)
    18-32: 626,067 bytes, 0.969 sec
    17-64: 621,738 bytes, 1.250 sec
    16-128: 619,408 bytes, 1.671 sec

    worms2.tar (17,097,728 bytes)
    18-32: 8,720,491 bytes, 15.937 sec
    17-64: 8,718,574 bytes, 25.344 sec
    16-128: 8,717,675 bytes, 43.125 sec

    3200.txt (16,013,962 bytes)
    18-32: 5,179,135 bytes, 7.968 sec
    17-64: 5,136,275 bytes, 11.125 sec
    16-128: 5,119,991 bytes, 16.437 sec

    Note that the current lzpm 0.02 uses a 16-bit hash with 128 hash bins - now you can see how much we can speed up compression!
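The "<hash bits>-<bins>" layout tested above can be sketched as follows. This is an illustrative reconstruction, not lzpm's actual code: each bucket of a 2^HASH_BITS table keeps the last NUM_BINS positions, inserted round-robin, and a match search probes every bin. Shrinking HASH_BITS while growing NUM_BINS keeps more candidates per bucket (better matches, slower search), which is exactly the trade-off in the timings above.

```python
# Sketch of a multi-bin hash match finder (illustrative names, toy hash).
HASH_BITS = 16        # table has 2**HASH_BITS buckets
NUM_BINS  = 4         # candidate positions kept per bucket
MIN_MATCH = 3

def hash3(data, i):
    """Hash the 3 bytes at position i down to HASH_BITS bits (toy mixer)."""
    h = data[i] | (data[i+1] << 8) | (data[i+2] << 16)
    return ((h * 2654435761) >> (32 - HASH_BITS)) & ((1 << HASH_BITS) - 1)

def find_longest_match(data, pos, table):
    """Probe every bin of the bucket for pos; return (best_len, best_pos)."""
    best_len, best_pos = 0, -1
    for cand in table[hash3(data, pos)]:
        if cand < 0:
            continue
        length = 0
        while pos + length < len(data) and data[cand+length] == data[pos+length]:
            length += 1
        if length > best_len:
            best_len, best_pos = length, cand
    return best_len, best_pos

def insert(table, slots, data, pos):
    """Round-robin insert: overwrite the oldest of the NUM_BINS entries."""
    h = hash3(data, pos)
    table[h][slots[h] % NUM_BINS] = pos
    slots[h] += 1

table = [[-1] * NUM_BINS for _ in range(1 << HASH_BITS)]
slots = [0] * (1 << HASH_BITS)
data = b"abcabcabcabc"
for i in range(len(data) - MIN_MATCH):
    length, src = find_longest_match(data, i, table)
    insert(table, slots, data, i)
```

With fewer hash bits, more distinct strings collide into each bucket, so the extra bins recover matches a sparser table would lose - at the cost of probing every bin on each search.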

    Anyway, I'm seriously considering a binary tree - it should be more efficient than hashing in this case.

    Hashing has a bad worst case - when a match search is done at nearly every byte (already-compressed or poorly compressible data). In addition, ROLZ must perform a larger number of match searches than LZ77. A binary tree should work better with lzpm...

    Also I'm wondering how Malcolm performs the same match search in his ROLZ2/ROLZ3. Playing with WinRK, I've found that the amount of additional memory required for compression depends on:
    + Dictionary size
    + Largest Optimised Match (0, or 1 and larger)

    The dictionary size part is clear. As for Largest Optimised Match:
    0 - disables optimization (Optimal Parsing);
    values of 1 and higher enable it.

    With Optimal Parsing enabled, ROLZ uses a larger amount of memory - in addition to the significant speed impact...
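The reason optimal parsing costs both memory and speed is visible in its basic shape: it is a shortest-path computation over byte positions, which needs the candidate matches at every position kept around. A minimal sketch with a toy fixed-bit cost model (hypothetical costs, not WinRK's or lzpm's actual coder):

```python
# Toy optimal parse: minimize total encoded bits over literal/match choices.
def optimal_parse(n, matches, lit_cost=9, match_cost=24):
    """matches[i] = list of match lengths usable at position i.
    Returns the minimal total cost in bits to encode positions 0..n."""
    INF = float("inf")
    cost = [INF] * (n + 1)
    cost[0] = 0
    for i in range(n):
        if cost[i] == INF:
            continue
        # Option 1: emit a literal, advance one byte.
        cost[i + 1] = min(cost[i + 1], cost[i] + lit_cost)
        # Option 2: emit any match starting at i, advance by its length.
        for length in matches.get(i, []):
            cost[i + length] = min(cost[i + length], cost[i] + match_cost)
    return cost[n]
```

The `matches` structure - every candidate at every position - is what a greedy parser never stores, which would explain the extra memory WinRK allocates once Largest Optimised Match is nonzero.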

  2. #2
    encode (The Founder)
    Another cool thing is hash chains! With a reduced chain length and a large hash, this approach works notably faster than the current hash table with multiple bins. However, in my implementation hash chaining needs 8*DICTSIZE + 8 MB.

    16 MB dictionary = 128 + 8 = 136 MB
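For reference, the classic hash-chain layout looks like the sketch below: `head[h]` holds the most recent position with hash `h`, and `prev[pos]` links back to the previous one, with a chain-length cap bounding the worst case mentioned in the first post. The names and per-node layout are illustrative - the 8*DICTSIZE figure above suggests lzpm stores more per position than the single 32-bit link used here.

```python
# Classic bounded hash-chain match finder (illustrative, not lzpm's code).
HASH_BITS = 16
MAX_CHAIN = 8     # cap on chain walks: trades match quality for speed

def hash3(data, i):
    h = data[i] | (data[i+1] << 8) | (data[i+2] << 16)
    return ((h * 2654435761) >> 8) & ((1 << HASH_BITS) - 1)

class ChainFinder:
    def __init__(self, size):
        self.head = [-1] * (1 << HASH_BITS)   # hash -> newest position
        self.prev = [-1] * size               # one link per dictionary position

    def find(self, data, pos):
        """Walk the chain for pos's hash; return (best_len, best_pos)."""
        best_len, best_pos = 0, -1
        cand, steps = self.head[hash3(data, pos)], 0
        while cand >= 0 and steps < MAX_CHAIN:
            length = 0
            while pos + length < len(data) and data[cand+length] == data[pos+length]:
                length += 1
            if length > best_len:
                best_len, best_pos = length, cand
            cand, steps = self.prev[cand], steps + 1
        return best_len, best_pos

    def insert(self, data, pos):
        """Push pos onto the front of its hash's chain."""
        h = hash3(data, pos)
        self.prev[pos] = self.head[h]
        self.head[h] = pos
```

Cutting MAX_CHAIN is what makes this "reduced chain length" variant fast: on degenerate data the multi-bin table always probes every bin, while a capped chain simply stops walking.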

  3. #3
    Bulat Ziganshin (Programmer)
    Join Date: Mar 2007
    Location: Uzbekistan
    Posts: 4,505
    Thanks: 741
    Thanked 665 Times in 359 Posts
    Hash chains should be no faster than a QUAD-style hash. Try using a rolling scheme (the one used for decoding in QUAD).
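As a guess at what the suggested rolling scheme looks like: a ROLZ decoder typically keeps, per context, a small circular table of recent positions that encoder and decoder update identically, so the encoder could reuse the same structure for match search. This is an assumption about the idea, not QUAD's actual code, and all names are illustrative.

```python
# Per-context rolling offset table, updated identically on both sides.
TABLE_SIZE = 4   # positions remembered per context

class RolzTable:
    def __init__(self):
        self.slots = {}   # context -> (positions list, next insert counter)

    def add(self, context, pos):
        """Rolling insert: overwrite the oldest of TABLE_SIZE entries."""
        positions, nxt = self.slots.setdefault(context, ([-1] * TABLE_SIZE, 0))
        positions[nxt % TABLE_SIZE] = pos
        self.slots[context] = (positions, nxt + 1)

    def get(self, context, index):
        """Decoder side: resolve a transmitted table index to a position."""
        return self.slots[context][0][index % TABLE_SIZE]
```

The appeal is that maintenance is O(1) per byte with no chains to walk, and the encoder only ever searches the TABLE_SIZE candidates the decoder can also see.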

  4. #4
    encode (The Founder)
    LZPM is not QUAD! And with LZPM the newly created hash-chaining scheme works faster! (The release is coming soon!)

Similar Threads

  1. lzpm 0.07 overview
    By encode in forum Forum Archive
    Replies: 18
    Last Post: 7th August 2007, 03:08
  2. LZPM 1.00 overview
    By encode in forum Forum Archive
    Replies: 21
    Last Post: 6th June 2007, 02:11
  3. lzpm overview
    By encode in forum Forum Archive
    Replies: 4
    Last Post: 14th April 2007, 23:30
  4. QUAD 1.11 overview
    By encode in forum Forum Archive
    Replies: 45
    Last Post: 1st April 2007, 22:36
  5. PIM 1.50 overview
    By encode in forum Forum Archive
    Replies: 10
    Last Post: 30th March 2007, 13:05
