Thread: Best compressor here?

  1. #1
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts

    Best compressor here?

    Hi, I'm new here. I've been interested in data compression on the PC since last year (and as my nickname shows, I like PAQ; thanks, Matt Mahoney).

    Since I'm new here and my English is not very good (I'm still learning it), reading through thousands of topics and posts would be very difficult and tiring. So I would like to ask the experts, if it is not a problem, to tell me which is the best compressor program, whether it was posted here or elsewhere.
    By 'best' I mean compression ratio ONLY. Time and hardware are not a problem for me and don't interest me; only the compression ratio matters, that is, the space saved on my HDD.

    The best one I've used so far is PAQ-PX-PRE (thanks again, Matt Mahoney). But as I read, it was created in 2009, and it is now 2012.

    Thanks!

    P.S.: Before somebody asks: by 'compression' I mean general-purpose ("homogeneous") compression. I would like to compress several file types, with every file treated the same way, simply as a sequence of bytes.
    Last edited by paqfan; 20th January 2012 at 23:15.

  2. #2
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    PAQ has the best compression ratios on general data. For text, try durilca.

    The best compression ratios in the PAQ series generally come from paq8kx. I've uploaded the last version.

  3. #3
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Thank you very much. I will check it and compare it with PX-PRE.

    Is the PAQ series still the best in compression ratio?

    P.S.: Which kx version is the one you uploaded? The latest kx version I have is v7.
    Last edited by paqfan; 20th January 2012 at 22:24.

  4. #4
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    PAQ still has the best compression ratios as far as I know. I've benchmarked hundreds of programs and PAQ is almost always on top for ratios.

    The version of paq8kx that I uploaded is my own 32-bit build of v7.

  5. #5
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    By "my own", do you mean that you modified the code?

    I also benchmarked the PAQ8 series from the original version up to PXPRE, including versions created between 2007 and 2009. But I tested only one 2 MB file, because I don't have much time for benchmarking. Also, I'm brand new to this topic, so I think it is better to leave the benchmarking work (probably 'waiting' is the better word) to the creators and experts.

    Thank you again, and thanks to Matt Mahoney for PAQ, as always!

    P.S.: Oops... I got "Out of memory" while using this paq8-kx-v7 after 120 minutes of compressing 2 MB... hahaha... As I said, benchmarking, as Montecuccoli would say, needs three things: time, time, and time... XD
    Last edited by paqfan; 20th January 2012 at 23:14.

  6. #6
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    The version of paq8kx that I uploaded decreases the memory usage so that -8 works. I only modified the memory usage and added the -9 option, so the archives it creates should be compatible with paq8kx_v7.

    Also, I'm currently benchmarking paq1-7, paq8(a,g,kx_v7,n,px_v69,px_v69kzu,pxpre,q), paq9a and zpaq with the -5 option on a TAR which has a bitmap, executable and text file. I'll post the results when they're finished.
    Last edited by david_werecat; 20th January 2012 at 23:52.

  7. #7
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Here's a directory listing for the benchmark:

    Code:
       468,449 bench.store.paq8kx_v7
       471,975 bench.store.paq8px_v69
       472,024 bench.store.paq8pxpre
       477,002 bench.store.paq8q
       490,072 bench.store.paq8n
       493,491 bench.store.fp8
       506,195 bench.store.paq8g
       508,243 bench.store.paq8a
       517,556 bench.store.paq7
       537,719 bench.store.paq6
       543,869 bench.store.zpaq
       548,971 bench.store.paq5
       559,052 bench.store.paq9a
       559,384 bench.store.paq4
       567,363 bench.store.lpaq9m
       572,485 bench.store.paq3
       592,704 bench.store.paq2
       605,217 bench.store.paq1
       747,975 bench.store.paq8px_v69kzu
     2,365,443 bench.store
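
To put the listing in relative terms, the sizes can be divided by the 2,365,443-byte input. The following is only a small Python sketch using a handful of the figures from the listing above; the names `sizes` and `percent_of_original` are made up for the example.

```python
# Sizes in bytes, copied from a few rows of the directory listing above.
ORIGINAL_SIZE = 2_365_443  # bench.store

sizes = {
    "paq8kx_v7": 468_449,
    "paq8px_v69": 471_975,
    "paq8pxpre": 472_024,
    "zpaq": 543_869,
    "paq1": 605_217,
    "paq8px_v69kzu": 747_975,
}


def percent_of_original(compressed: int, original: int = ORIGINAL_SIZE) -> float:
    """Return the compressed size as a percentage of the original size."""
    return 100.0 * compressed / original


# Print the entries from the smallest archive to the largest.
for name, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{name:>15}  {size:>9,}  {percent_of_original(size):6.2f}%")
```

On this particular test file the gap between the top three entries is well under one percentage point, which is why results on other files can easily come out in a different order.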

  8. #8
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Nice. So KX-V7 is the best, as far as I can see, and PX-v69KZU is the worst (?).
    I have PX-V69, but I didn't know there was a "KZU version" of it. What a surprise!
    There are some funny version names in the PAQ series.

    And there are some strange things in this benchmark, as far as I can see: for example, FP8 ranks fairly high, but as far as I knew it is just a training version for amateur programmers.
    Also, some older versions of PAQ do better than some paq8 variants.
    And why is paq8px-v69 so "bad"? (I don't mean truly 'bad'; you know, PAQ is always great at at least one thing.)

    What a surprise! XD Thanks a lot!

  9. #9
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Hey!!! Something is "wrong" on my laptop:

    I tested this KX version, but it compressed my little 2,299,056-byte PickUp.exe 12.3% worse (!!) than PAQ-PX-PRE did. PAQ-KX-V7 compressed it to 768,821 bytes, but PAQ-PX-PRE did better and got it down to 684,612 bytes!?!

    Is PAQ-PX-PRE specialized for EXE applications? Or what could be the reason?
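
The 12.3% figure can be checked from the two archive sizes quoted in this post. A tiny Python sketch (the function name `percent_worse` is only illustrative):

```python
# Archive sizes in bytes, as quoted in the post.
KX_SIZE = 768_821      # PickUp.exe compressed with paq8kx_v7
PXPRE_SIZE = 684_612   # PickUp.exe compressed with paq8pxpre


def percent_worse(larger: int, smaller: int) -> float:
    """How much bigger `larger` is than `smaller`, as a percentage of `smaller`."""
    return 100.0 * (larger - smaller) / smaller


print(f"{percent_worse(KX_SIZE, PXPRE_SIZE):.1f}% larger")
```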

  10. #10
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    paq8pxpre is specialized for already compressed data. PickUp.exe must have compressed streams in it, for example if it is an installer. paq8kx is better for general data, whereas paq8pxpre is better for already compressed data. To get a similar effect with paq8kx, try packing the input data into a TAR archive, running it through precomp, then running the precomp output file through paq8kx.

  11. #11
    Member Surfer
    Join Date
    Mar 2009
    Location
    oren
    Posts
    203
    Thanks
    18
    Thanked 7 Times in 1 Post
    What about Nanozip? Compromise between time and compression ratio.

  12. #12
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    I know about NanoZip; I have version 0.9 too, and I sometimes use it when I make packages for others, but right now I'm interested in the current top compression ratio.

    David WereCat, sorry, but I'm a bit confused now. Why do I have to make a TAR archive? Also, I don't really know how to make an optimized TAR archive, even though I've seen TAR archives occasionally.

    My PickUp.exe is just a simple protection application which I wrote in GML. It doesn't contain any installation package (as far as I know). The only difference between other native EXEs and applications/games written in GML is that the latter can't be packed with any EXE compressor, otherwise they won't run. (I don't know the real reason why.)

    So, what is the optimal order you described?
    TAR > PAQ-PX-PRE > PAQ-KX-7
    Am I right?
    And can the precompressor PAQ-PX-PRE be used more than once?

    So as I see it, a precompressor is different from a real compressor (am I right?). If so, is PAQ-PX-PRE the best precompressor?

  13. #13
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    The precompressor exists so that the compressor can get better ratios on special data types. The reason for using TAR is that precomp doesn't archive files itself; it accepts only a single file as input. TAR is just a way of storing the directory structure so that multiple files can be joined into a single file, so you can use other storage archive formats such as SHAR or STORE if you want. You can find precomp here: http://encode.su/threads/1366-Precom...hlight=precomp. The optimal order doesn't use paq8pxpre at all. It uses:
    TAR -> precomp -> paq8kx_v7
    The precompressor should only be run once, since it preprocesses all input data in one pass.
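
The TAR -> precomp -> paq8kx_v7 order can be scripted. The following is only a sketch under several assumptions: `precomp.exe` and `paq8kx.exe` are on the PATH, the `-c-` and `-8` switches are simply copied from command lines posted in this thread, and precomp is assumed to write its output next to the input with a `.pcf` extension (as in the PickUp.exe -> PickUp.pcf example in this thread). The helper names are hypothetical.

```python
import subprocess
import tarfile
from pathlib import Path


def make_store_tar(inputs, tar_path):
    """Join the input files into one TAR; mode "w" only stores, it does not compress."""
    with tarfile.open(tar_path, "w") as tar:
        for path in inputs:
            tar.add(path, arcname=Path(path).name)
    return Path(tar_path)


def build_commands(tar_path):
    """Return the two command lines to run, in order (assumed switches, see above)."""
    tar_path = Path(tar_path)
    pcf_path = tar_path.with_suffix(".pcf")  # assumed precomp output name
    return [
        ["precomp.exe", "-c-", str(tar_path)],  # precompress once, no own compression
        ["paq8kx.exe", "-8", str(pcf_path)],    # then compress the .pcf with paq8kx
    ]


def run_pipeline(inputs, tar_path="bench.tar"):
    """Build the TAR, then run precomp and paq8kx one after the other."""
    make_store_tar(inputs, tar_path)
    for command in build_commands(tar_path):
        subprocess.run(command, check=True)  # stop if either tool reports an error
```

Calling `run_pipeline(["a.bmp", "b.exe", "c.txt"])` would run each tool exactly once, which matches the advice that the precompressor should not be applied repeatedly.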

    Of course, different versions of PAQ compress differently so it is important to consider many different filetypes to find which one compresses better. In my tests, paq8kx compressed better but you might get different results based on what type of data you want to compress. paq8px_v69 and paq8pxpre are the best compressors aside from paq8kx in my tests, so if your data benefits from using paq8pxpre, then it would be the best compressor to use.

    Also, as a side note, paq8pxpre is just paq8px joined with precomp. It automatically performs the sequence TAR -> precomp -> paq8px so it's easy to use.
    Last edited by david_werecat; 21st January 2012 at 19:04.

  14. #14
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Oh, thank you! I've learned more now. By the way, I noticed earlier that the two DLLs next to paq8pxpre are probably precompressor DLLs: one for JPG, as I remember, and the other for the rest of the precompression work. But which precompressor version is used in PAQ-PX-PRE? (As I see in the topic you linked, there are other versions.)

  15. #15
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    paq8pxpre is paq8px_v67 combined with precomp 0.3.7.

  16. #16
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Oh, I see. Thanks. I read through the linked topic and now I understand the basic idea of a precompressor. So is the main method in precompressors really as simple as doing decompression work? I know such a program can still be hard to write, but I thought the theory behind a precompressor was much harder: something like the precompressor rewriting a block of data so that the original can be restored losslessly, while the rewritten block compresses with a better ratio than the original block. That is what I thought, but now I know it is simpler.
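
The principle discussed here (decompress an embedded stream, and keep the result only if recompressing it reproduces the original bytes exactly) can be sketched as follows. This is an illustration using Python's standard `zlib` module, not Precomp's actual code; the function names `precompress` and `restore` are made up for the example, and a real tool has to deal with many more formats and encoder settings.

```python
import zlib


def precompress(stream: bytes):
    """Return (raw_data, level) if some zlib level rebuilds `stream` exactly, else None."""
    try:
        raw = zlib.decompress(stream)
    except zlib.error:
        return None  # not a zlib stream at all
    for level in range(1, 10):
        if zlib.compress(raw, level) == stream:
            return raw, level  # raw data plus the setting needed to restore it
    return None  # cannot be restored bit-for-bit, so the stream is left untouched


def restore(raw: bytes, level: int) -> bytes:
    """Rebuild the original compressed stream from the raw data and the stored level."""
    return zlib.compress(raw, level)


# Demonstration: a stream made with level 6 is expanded, then restored exactly.
original = b"the quick brown fox jumps over the lazy dog. " * 200
stream = zlib.compress(original, 6)
result = precompress(stream)
if result is not None:
    raw, level = result
    print(len(stream), "->", len(raw), "bytes; restored exactly:", restore(raw, level) == stream)
```

The expanded data is larger than the stream, but a strong compressor such as PAQ can model the raw bytes directly instead of the weakly compressed deflate output, which is where the gain comes from.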

  17. #17
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Yeah, I tested Schnaader's Precomp 0.4a with the PAQ-KX-V7 compressor. It works perfectly for me. But... er... yes, the time:
    It took 15 hours, 25 minutes and 18 seconds... but no matter, the main and only important thing for me now is the compression ratio.

    Here is my log of the compression of PickUp.exe:

    Best compression ratio achieved so far: (PickUp.exe) 2,299,056 bytes > 661,644 bytes

    Method used:
    precomp.exe -c- -d50 -intense -brute -s2000 -pdfbmp+ -progonly+ -mjpeg+ -v PickUp.exe
    paq8kx.exe -8 PickUp.pcf

  18. #18
    Programmer schnaader
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    571
    Thanks
    219
    Thanked 205 Times in 97 Posts
    Originally posted by paqfan:
    precomp.exe -c- -d50 -intense -brute -s2000 -pdfbmp+ -progonly+ -mjpeg+ -v PickUp.exe
    I'd suggest the following instead:
    Code:
    precomp.exe -c- -intense -pdfbmp+ -progonly+ -mjpeg+ -v PickUp.exe
    About the removed switches:

    "-d50" sets recursion depth to 50. Standard is 10 which I never saw used - even for big ISO files, you barely reach 2 or 3. It doesn't make compression slower, but it's pretty useless.
    "-brute" is a switch to detect streams without header, but since it will try to decompress and recompress very often, it is _very_ slow. "-intense" is good enough and not as slow. Time will go down to seconds instead of hours.
    "-s2000" sets minimal stream size to 2000 bytes which is useful for speeding up "-brute" a bit, but is not good in general as it will lead to small streams (which are quite common in JARs and PDFs) being skipped.

    You might also consider removing "-v" as it just prints some more or less useful debug output.

    If you want to test another precompressor, have a look at Shelwien's reflate (http://encode.su/threads/1399-reflat...e-recompressor), latest version is available here: http://nishi.dreamhosters.com/u/reflate_v0c1.rar - it is only detecting deflate streams at the moment, but similar to Precomp's "-brute" mode without the big speed penalty - it doesn't have special handling for filetypes like PNG, JPG, GIF..., but has better handling for some deflate streams that Precomp skips.
    Last edited by schnaader; 22nd January 2012 at 16:29.
    http://schnaader.info
    Damn kids. They're all alike.

  19. #19
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Thanks very much! Because of my bad English I thought that -s2000 would increase the compression ratio, but I see I was completely wrong. Thanks for your suggestion, it was useful to me.

  20. #20
    Expert
    Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    "Best compressor" is not a computable problem. The best you can do for any file is the shortest program that outputs that file. But in general it is impossible to know whether you found it or if there might be something better.

    So we use benchmarks. But different benchmarks give different answers. PAQ wins a lot of them but not always. http://mattmahoney.net/dc/text.html
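
In practice this means any compressor only gives an upper bound on the shortest description of a file, and trying several can only tighten that bound. A rough Python sketch of the idea, using three standard-library compressors as stand-ins (none of them is PAQ, and the function name `best_known_size` is made up for the example):

```python
import bz2
import lzma
import zlib


def best_known_size(data: bytes):
    """Return (method, size) for the smallest output among the compressors tried."""
    candidates = {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }
    winner = min(candidates, key=candidates.get)
    return winner, candidates[winner]


sample = b"abc" * 10_000
method, size = best_known_size(sample)
print(f"best of the methods tried: {method}, {size} bytes (an upper bound only)")
```

The result says nothing about whether an untried method would do better, and the winner can change from file to file, which is the reason different benchmarks rank compressors differently.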

  21. #21
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    32
    Thanks
    9
    Thanked 0 Times in 0 Posts
    Yes, I read a lot of your benchmarks when I started to "research" compressors with better ratios than the usual compression programs.
    I know there isn't a universal program with the best compression ratio for every file type, even if the method is context mixing. I'm just searching for one which is "the best" in most cases, even if it is very slow.
    I'm also trying out different ideas for building better compressors on my own, but usually I'm not very successful. I'm just an amateur programmer, so I'm just trying to learn...
    But I think there could be alternative methods to get slightly better compression ratios. For example, I heard about a notation technique from musicians for reducing scores and sheet music: in some cases a piece of sheet music can be rewritten with fewer notes than the original in a deterministic way. So I think we should try to create a program that first decodes a data file into notes (maybe just a conversion from hexadecimal to duodecimal) and then tries to reduce the file's "notation score"...
    I'm also working on an idea which could perhaps be called "after-(re)compression": I try to rewrite already compressed data, which naturally can't simply be recompressed, in a deterministic way so that it can be compressed again and save more space... Anyway, these are not very successful yet; they are just home experiments.
