
Thread: A nooblike ROLZ compressor

  1. #1
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    160
    Thanks
    18
    Thanked 56 Times in 27 Posts

A nooblike ROLZ compressor

Since I'm new to ROLZ, the time performance is not good at all. But it seems to achieve a better compression ratio than a normal LZ77 compressor!

    Code:
    world95.txt 3005020 => 557787
    bible.txt   4047392 => 811732
    fp.log     20617071 => 681905
    http://comprox.googlecode.com/files/...z-0.1.0.tar.gz

  2. #2
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    462
    Thanked 175 Times in 85 Posts
    For compiling, my MinGW needs crblib, but the one provided by Charles Bloom doesn't seem to work:

dpcm.c:63:24: fatal error: crbinc/inc.h: No such file or directory
compilation terminated.
imppm.c:8:24: fatal error: crbinc/inc.h: No such file or directory
compilation terminated.

    C:\MinGW\bin>gcc -O3 *.c
In file included from dpcm.c:63:0:
c:\mingw\bin\../lib/gcc/mingw32/4.6.2/../../../../include/crbinc/inc.h:23:28:
fatal error: crblib/memutil.h: No such file or directory
compilation terminated.
In file included from imppm.c:8:0:
c:\mingw\bin\../lib/gcc/mingw32/4.6.2/../../../../include/crbinc/inc.h:23:28:
fatal error: crblib/memutil.h: No such file or directory
compilation terminated.

  3. #3
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I compiled with no problem using "gcc -O3 *.c" with MinGW 4.6.1 (Win32). Test results:
    http://mattmahoney.net/dc/text.html#2158
    http://mattmahoney.net/dc/silesia.html

    I guess you might need pthreadGC2.dll to run.

  4. #4
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    462
    Thanked 175 Times in 85 Posts
    I have also compiled it now. Thanks for the hint.

  5. #5
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    160
    Thanks
    18
    Thanked 56 Times in 27 Posts
    Quote Originally Posted by Matt Mahoney View Post
    I compiled with no problem using "gcc -O3 *.c" with MinGW 4.6.1 (Win32). Test results:
    http://mattmahoney.net/dc/text.html#2158
    http://mattmahoney.net/dc/silesia.html

    I guess you might need pthreadGC2.dll to run.
    Thanks for the benchmark.
I use native threading APIs on Windows, so pthreadGC2.dll is no longer needed. (I successfully compiled and ran it with mingw32 and Wine under Linux.)

I found that all my programs have bad decompression performance. I think the context I use for decoding literals is too long.
Should I pay more attention to optimal parsing and reduce the context length?

  6. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
I didn't look at the source, but ROLZ decompresses slower than LZ77 because the decompressor has to maintain an index.
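Matt's point can be sketched in C: a ROLZ decoder must rebuild, byte by byte, the same per-context offset table the encoder used, whereas an LZ77 decoder just copies from a flat window. This is an illustrative sketch, not code from comprox; the table size and the order-1 (previous-byte) context are assumptions.

```c
#include <stdint.h>
#include <string.h>

#define ROLZ_SLOTS 16  /* offsets remembered per context (assumed) */

/* One small table of recent offsets per order-1 context. */
typedef struct {
    uint32_t offs[256][ROLZ_SLOTS];
    int      head[256];
} RolzIndex;

/* Both encoder and decoder must call this at every position --
   this is the extra bookkeeping LZ77 decoding does not need. */
static void rolz_update(RolzIndex *ix, const uint8_t *buf, uint32_t pos)
{
    if (pos == 0) return;
    uint8_t ctx = buf[pos - 1];
    ix->head[ctx] = (ix->head[ctx] + 1) % ROLZ_SLOTS;
    ix->offs[ctx][ix->head[ctx]] = pos;
}

/* A ROLZ match is coded as (slot, len); the decoder resolves the
   slot back to an absolute offset through the same table. */
static uint32_t rolz_offset(const RolzIndex *ix, uint8_t ctx, int slot)
{
    int i = (ix->head[ctx] - slot + ROLZ_SLOTS) % ROLZ_SLOTS;
    return ix->offs[ctx][i];
}
```

Because the slot index is small (here 4 bits instead of a 24-bit offset), ROLZ gains ratio over plain LZ77, at the cost of this per-byte table maintenance on decode.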

  7. #7
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @RichSelian:

    "I use native threading APIs on windows, so pthreadGC2.dll is no longer needed."

Can you please post a Win32 binary here in the forum?

I want to compare the program with the open-source ROLZ compressor BALZ from encode.

The compression seems to be not bad ...

"bad performance on decompression"

Maybe because your compression algorithm can use 2 cores, but your decompression algorithm can use only 1 core?

    best regards
    Joerg

  8. #8
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
Compiled with "gcc -O3 *.c" in MinGW 4.6.1 for Win32, packed with UPX.
Attached Files

  9. #9
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    160
    Thanks
    18
    Thanked 56 Times in 27 Posts
I use an o2-o1 model to encode literals; that's the main reason decompression is slow, but it is necessary for a good compression ratio.
What I don't understand is why some LZ compressors (like xz) can achieve as good a compression ratio while using only an o1 model?

  10. #10
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    It seems like high order modeling of literals is not needed because they would be coded as matches instead. Also, some algorithms like LZMA use literal exclusion after matches. The first byte after a match would be poorly predicted by a model or else it would have extended the match, so it XORs it with the predicted byte.
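A minimal sketch of the literal-exclusion idea Matt describes, taking the "XORs it with the predicted byte" description literally. (Real LZMA actually codes the literal bit by bit, using the matched byte's bits as context until they diverge, rather than XORing whole bytes; the function names here are illustrative.)

```c
#include <stdint.h>

/* Right after a match, the byte at the same distance back is a
   strong prediction: the literal being coded is exactly the byte
   that BROKE the match, so it must differ from the prediction.
   Coding lit ^ predicted hands the entropy coder a symbol whose
   distribution is skewed (never 0 for the first post-match byte). */
static uint8_t encode_matched_literal(const uint8_t *buf, uint32_t pos,
                                      uint32_t last_dist)
{
    uint8_t predicted = buf[pos - last_dist];
    return buf[pos] ^ predicted;   /* symbol given to the entropy coder */
}

static uint8_t decode_matched_literal(const uint8_t *buf, uint32_t pos,
                                      uint32_t last_dist, uint8_t sym)
{
    return sym ^ buf[pos - last_dist];   /* XOR is its own inverse */
}
```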

  11. #11
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    160
    Thanks
    18
    Thanked 56 Times in 27 Posts
    Quote Originally Posted by Matt Mahoney View Post
    It seems like high order modeling of literals is not needed because they would be coded as matches instead. Also, some algorithms like LZMA use literal exclusion after matches. The first byte after a match would be poorly predicted by a model or else it would have extended the match, so it XORs it with the predicted byte.
That means you have to search for len=3 matches? But will replacing 3 literals with a pos/len pair really help the compression ratio? (Maybe you are using some optimal-parsing tricks to limit the match positions?)

  12. #12
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
Length-3 matches aren't worth coding because they will be common even in random files with a 16 MB window, resulting in no compression. What I mean is that if you use a context model to predict literals, it will usually make a wrong prediction right after a match. Your model needs to account for this.
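The "common even in random files" claim is easy to check with arithmetic: a 16 MB window holds about 2^24 candidate positions, and there are only 256^3 = 2^24 distinct 3-byte strings, so even in uniformly random data you expect roughly one chance length-3 match at every position. Such matches therefore carry almost no information, and a pos/len pair costs more to code than 3 literals. A tiny helper makes the arithmetic concrete:

```c
/* Expected number of chance length-3 matches per position in
   uniformly random data, given a window of `window_bytes` prior
   positions and 256^3 equally likely 3-byte strings. */
static double expected_len3_matches(double window_bytes)
{
    return window_bytes / (256.0 * 256.0 * 256.0);
}
```

For a 16 MB window this evaluates to about 1.0, which is why practical LZ coders either raise the minimum match length or restrict where short matches are allowed.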
