
Thread: LZJody

  1. #1 (joined Jul 2013, United States)


    I just came across a couple of compression libraries I hadn't heard of, and AFAIK they haven't been discussed here. One of them is LZJody.

    Copied from the README:

    This code compresses and decompresses a data stream using a combination of compression techniques that are optimized for compressing disk image data.

    Compression methods used by this program include:

    • Run-length encoding (RLE), packing long repetitions of a single byte value into a short value:length pair
    • Lempel-Ziv (dictionary-based) compression with optimized searching
    • Sequential increment compression, where 8-, 16-, and 32-bit values that are incremented by 1 are converted to a pair consisting of an initial value and a count
    • Byte plane transformation, putting bytes at specific intervals together. This is performed on otherwise incompressible data to see whether it can be rearranged into a compressible pattern.
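
    To make the sequential-increment idea concrete, here is a minimal Python sketch (my own illustration; the function names and the (start, count) pair representation are assumptions, not lzjody's actual stream layout):

```python
def seq_encode(data: bytes):
    """Collapse maximal runs of byte values incrementing by 1
    into (start, count) pairs."""
    out = []
    i = 0
    while i < len(data):
        j = i
        # Extend the run while each byte is the previous byte plus one.
        while j + 1 < len(data) and data[j + 1] == (data[j] + 1) % 256:
            j += 1
        out.append((data[i], j - i + 1))
        i = j + 1
    return out

def seq_decode(pairs):
    """Inverse of seq_encode: expand each (start, count) pair."""
    return bytes((start + k) % 256
                 for start, count in pairs
                 for k in range(count))
```

    A run like 10, 11, ..., 19 collapses to the single pair (10, 10); data without incrementing runs degenerates to one pair per byte, which is why a real codec would only emit this form when it wins.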

    Interestingly, you can only feed it 4096 bytes at a time. It wouldn't be hard to create a file format similar to snappy-framed that strings together multiple blocks of at most 4096 bytes for compressing larger pieces of data.
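
    A minimal sketch of such a container, assuming a 2-byte little-endian length prefix per compressed chunk (the field width and layout here are my invention, not snappy-framed's or lzjody's actual format):

```python
import struct

BLOCK = 4096  # lzjody's per-call input limit, per the README

def frame(data: bytes, compress) -> bytes:
    """Split data into <=4096-byte chunks, compress each independently,
    and prefix each compressed chunk with a 2-byte LE length."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        c = compress(data[i:i + BLOCK])
        out += struct.pack('<H', len(c)) + c
    return bytes(out)

def unframe(blob: bytes, decompress) -> bytes:
    """Walk the length-prefixed chunks and concatenate the results."""
    out = bytearray()
    i = 0
    while i < len(blob):
        (n,) = struct.unpack_from('<H', blob, i)
        out += decompress(blob[i + 2:i + 2 + n])
        i += 2 + n
    return bytes(out)
```

    Plugging in identity functions for compress/decompress round-trips any input; a real container would also want a magic number and per-chunk checksums, as snappy-framed has.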

    It is licensed under the GPLv2.

  2. #2 (joined Feb 2015, United Kingdom)
    I just tested the "incompressible" file provided; it is easily compressed with any method other than Huffman, so it isn't actually incompressible. That said, its entropy is a tad under 7 bits per byte, which is quite high, so I suppose it qualifies as somewhat incompressible.
    On closer inspection, over 26% of the file consists of byte values 92-101 (10 distinct bytes). Another 40% is made up of 27 other byte values, and the remaining 34% is a mix of all other possible bytes at a nearly equal but very low frequency.

    I'm not quite sure what file would be an appropriate test to demonstrate the benefit of the byte plane transform, as this isn't a very good "incompressible" file considering it compresses to about 35% of its original size with CM (fp8v3). :/
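
    For anyone wanting to reproduce this kind of analysis, a small sketch that computes bits-per-byte entropy and the share of the 92-101 byte range (my own helper, not tied to any particular tool):

```python
import math
from collections import Counter

def byte_stats(data: bytes):
    """Return (Shannon entropy in bits per byte, fraction of the data
    taken up by byte values 92-101)."""
    counts = Counter(data)
    n = len(data)
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    band = sum(counts[b] for b in range(92, 102)) / n
    return entropy, band
```

    Uniformly distributed data comes out at 8.0 bits per byte; a constant block comes out at 0.0, so "a tad under 7" sits well into the compressible range.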
    Last edited by Lucas; 2nd September 2015 at 08:34.

  3. #3 — RichSelian (joined Aug 2011, Shenzhen, China)
    test with enwik8:
    e: 100000000 -> 92340743, 11s
    d: 100000000 <- 92340743, 4s

  4. #4 (joined Nov 2018, North Carolina, USA)
    I know I'm three years late to this thread, but I thought I'd explain a bit. I chose the "incompressible" data block because it was in a disk image and the algorithm didn't compress it at all, so I could use it to easily test the behavior when compression of a block failed. lzjody is just my attempt at playing with compression ideas; I am not a data compression expert by any means.

    The more interesting bit that drove me to respond is the question about the utility of the byte plane transform. The idea came about because I noticed a pattern in several data blocks where, in 32-bit pieces of data, one byte would increment while the other three stayed the same. I realized that 3/4 of such data would compress very efficiently with RLE if the identical bytes could be rearranged to sit side by side. Applied to a full block of such 32-bit incremental data (regardless of which byte is incrementing), the byte plane transform lets 3/4 of the block compress as RLE and the remaining 1/4 compress as seq8 when the incrementing byte sequence increments by one.
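
    A quick sketch of that effect (my own illustration of the general idea, not lzjody's actual transform code): grouping every fourth byte together turns a block of little-endian 32-bit counters into three all-zero planes, which RLE handles, plus one incrementing plane, which a seq8-style coder handles:

```python
import struct

def byte_planes(data: bytes, width: int = 4) -> bytes:
    """Group every width-th byte together: plane 0 holds bytes
    0, width, 2*width, ...; plane 1 holds bytes 1, width+1, ...; etc.
    Assumes len(data) is a multiple of width."""
    return b''.join(data[p::width] for p in range(width))

# A block of 32-bit little-endian counters 0, 1, 2, ...:
# only the low byte of each word changes.
block = b''.join(struct.pack('<I', i) for i in range(8))
planes = byte_planes(block)
# First plane: the incrementing low bytes 0..7.
# Remaining three planes: 24 zero bytes, ideal for RLE.
```

    Before the transform the block has no repeated-byte runs at all; after it, 3/4 of the bytes form a single zero run, which matches the 3/4-RLE / 1/4-seq8 split described above.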
