Thread: PAQ8Q

  1. #1
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts

    PAQ8Q

    PAQ8Q combines two very good branches of PAQ8P. PAQ8P3 was much faster because it detects text and switches off the non-text models (plus some other tweaks), while its output is usually not much larger.
    PAQ8PX collected the best models from all PAQ versions and added or improved detection for many file formats such as TGA, MOD, TIFF, WAV and JPG.

    To get both the maximum compression and the smart speed-up of p3, there is a new parameter -mM, where M is a number from 1 (fastest) to 6 (maximum).
    The default is -m5, which is a strong level. -m6 is much slower and usually only a little better.
    The 6 modes are currently only fully used for text compression. For EXE files and normal compression there are only four distinct levels: -m1, -m2 to -m4, -m5 and -m6.
    It's easy to add a huge speed-up with a small ratio loss for any file suffix.
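
    For readers who want to picture the switch, here is a minimal sketch of how a -mM argument could be parsed and mapped to model tiers. The names parseMode and effectiveTier are hypothetical and not taken from the paq8q source; the grouping for non-text data simply mirrors the description above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the mode selected by a "-mM" argument,
// or the default mode (5) when the argument is not a valid mode switch.
int parseMode(const char* arg) {
    const int kDefaultMode = 5;
    if (arg == nullptr || std::strlen(arg) != 3) return kDefaultMode;
    if (arg[0] != '-' || arg[1] != 'm') return kDefaultMode;
    int m = arg[2] - '0';
    return (m >= 1 && m <= 6) ? m : kDefaultMode;
}

// Hypothetical helper: text uses all six modes; other data only
// distinguishes -m1, -m2..-m4, -m5 and -m6 (returned as tiers 1..4).
int effectiveTier(int mode, bool isText) {
    assert(mode >= 1 && mode <= 6);
    if (isText) return mode;
    if (mode == 1) return 1;
    if (mode <= 4) return 2;
    return mode == 5 ? 3 : 4;
}
```

    With this sketch, -m2, -m3 and -m4 land in the same tier for an EXE block, while a text block still sees six separate levels.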

  2. #2
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Brilliant, can't wait for the LovePimple compile, hopefully with a nice SSE2 build too.

  3. #3
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    958
    Thanks
    570
    Thanked 396 Times in 294 Posts
    @all
    I'm impressed by the idea of mixing different models in one version, because it keeps the ability to reach maximum compression and, on the other hand, gives a chance to make faster versions. Paq8p3x_v20 is better than the other paq8px versions on my testbed, and I hope the paq8q compression level will be similar. Great piece of work!
    Following this idea, I have a question about implementing the paq8k model, which is still the most powerful model for lots of files with undefined structure. Is it possible?
    On my testbed paq8k keeps 1st place for almost all files for which paq doesn't use any special model (tiff, bmp, wave, jpg). Sometimes the difference is about 3%. I know that the paq8k model is 3-6 times slower than paq8px, but maybe it could be used in a mode -m7 (or higher, or -mx for an insane compression level).

    Darek

    p.s. - for some files, text detection (UDF model) prints more than 1000 messages saying that the algorithm found a text section of 50-150 bytes. Of course this visibly hurts compression. In my opinion text detection (at this stage of the algorithm) should be turned off in all modes, or it should be switchable by the user.

  4. #4
    Programmer Jan Ondrus's Avatar
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    I propose something like this: for -m6, use all models for all non-image and non-audio blocks ->

    Code:
      // -m6: run every general-purpose model on all blocks that are not IMAGE1
      if (cmode==6 && filetype!=IMAGE1) {
        sparseModel(m,ismatch,order);
        sparseTextModel(m);
        distanceModel(m);
        recordModel(m);
        wordModel(m);
        indirectModel(m);
        dmcModel(m);
        nestModel(m);
        if (filetype==EXE) exeModel(m);  // the EXE model is only useful for executables
      } else if (cmode!=1 && filetype!=IMAGE1) {
        // ...your advanced model selection here...
      }
    You also need to increase the maximum number of inputs for the Mixer (the first constructor argument), since more models now feed it:
    Code:
      static Mixer m(800, 3088, 7, 128);
    ->
    Code:
      static Mixer m(810, 3088, 7, 128);
    Test results:
    x.tar paq8q 649604 bytes
    x.tar paq8px 649386 bytes
    x.tar modified_paq8q 649219 bytes

    x.exe paq8q 347592 bytes
    x.exe paq8px 346105 bytes
    x.exe modified_paq8q 345459 bytes

    modified_paq8q is just paq8q with changes described above
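
    To put these numbers in proportion, a small helper (the name savingPercent is made up for this illustration) can compute the relative gain of one result over another:

```cpp
#include <cassert>

// Percentage by which `modified` is smaller than `baseline`.
double savingPercent(long baseline, long modified) {
    assert(baseline > 0);
    return 100.0 * static_cast<double>(baseline - modified)
                 / static_cast<double>(baseline);
}
```

    Fed with the sizes above, savingPercent(649604, 649219) comes out at roughly 0.06% for x.tar and savingPercent(347592, 345459) at roughly 0.6% for x.exe, both measured against plain paq8q.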
    Last edited by Jan Ondrus; 13th May 2009 at 12:10.

  5. #5
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Thanks Simon!

    Quote Originally Posted by DARcode View Post
    Brilliant, can't wait for the LovePimple compile, hopefully with a nice SSE2 build too.
    ENJOY!

    EDIT: Attachments removed.

  6. #6
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Quote Originally Posted by Jan Ondrus View Post
    modified_paq8q is just paq8q with changes described above
    Thanks Jan!

    Compiled...

    EDIT: Attachments "paq8q.zip" and "paq8q_sse2.zip" removed.

  7. #7
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    Cheers LP, some quick times over 4 files (8,623,484 bytes): BMP, JPEG, EXE, TXT.

    Code:
    			Time
    Normal			111.19
    Speed			89.31
    SSE_Intel		86.23
    SSE2_Intel		85.63

  8. #8
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Thanks for the feedback Intrinsic! I will release more of these SSE2 builds as long as they continue to be popular with PAQ fans.

  9. #9
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Thank you Jan.
    I am aware that there are many things to improve, at least the default models and the models used for EXE files.
    This is only the base, which can be improved through research and knowledge.

  10. #10
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Kaido aka kaitz improved the TEXT detection in paq8p3. It's now included in paq8q. For example, it detects enwik8 as one TEXTUTF8 block (there are some small non-ASCII parts).

    Darek and Jan, could you please confirm that it's better for your files too?

    The second change is that text detection is turned off for -m1. Jan's modification is in too.
    Last edited by Simon Berger; 13th May 2009 at 17:45. Reason: Removed. See some posts later

  11. #11
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Thanks Simon!

    Compiled...

    EDIT: Attachments "paq8q2.zip" and "paq8q2_sse2.zip" removed.

  12. #12
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    I am sorry. There was a bug with UTF8 detection.
    Now all text will be printed only as TEXT.

    I found this because the sse2 build crashed for me. I don't believe it's because of this bug... my own build worked.

  13. #13
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    958
    Thanks
    570
    Thanked 396 Times in 294 Posts
    @Simon,
    my SSE version of paq8q2 works, but there are no major differences in text recognition. My example test file:

    k.wad, -6 -m5 scores:
    original: 12'408'292
    paq8q: 2'665'147 (more than 1000 messages about text parts 50-150 bytes long)
    paq8q2: 2'665'026 (a similar number of messages about text parts 50-150 bytes long)

    If fixing the mentioned bug could change the situation then I'll wait for the next compile, but the current version isn't much better than the first paq8q. If I count the exact number of text parts, I'll post it for each version.

    Darek

  14. #14
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    No, the detection is the same. So it needs more improvements.

    It could never be really good. Your file wasn't affected by this bug because only TEXT blocks were detected, no TEXTUTF8.
    It may be the same thing that Jan found in his language. In German we have ä, ö, ü, where for example ö has the code F6. Normal text can only be detected in the range 33-127.
    The only solution I can think of is to tolerate one or two "bad hits" when they come after a long text part and are followed by many good hits.
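
    A rough sketch of that "bad hits" idea, under stated assumptions: the function name, the thresholds (80 bytes before, 8 bytes after, at most 2 bad bytes) and the treatment of space as a text byte are all invented for illustration and are not the paq8q detection code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed byte test, loosely following the 33-127 range quoted above;
// space, tab, LF and CR are accepted here as an additional assumption.
bool isTextByte(unsigned char c) {
    return (c >= 32 && c < 128) || c == 9 || c == 10 || c == 13;
}

// Hypothetical sketch: length of the text run starting at `pos`. A gap of up
// to `maxBad` non-text bytes is tolerated if at least `minBefore` text bytes
// were already seen in the run and at least `minAfter` text bytes follow it.
std::size_t tolerantTextRun(const std::string& buf, std::size_t pos,
                            std::size_t maxBad = 2,
                            std::size_t minBefore = 80,
                            std::size_t minAfter = 8) {
    assert(pos <= buf.size());
    auto isText = [&buf](std::size_t k) {
        return isTextByte(static_cast<unsigned char>(buf[k]));
    };
    std::size_t i = pos;
    std::size_t textBytes = 0;  // text bytes accepted so far in this run
    while (i < buf.size()) {
        if (isText(i)) { ++i; ++textBytes; continue; }
        if (textBytes < minBefore) break;  // not enough text before the gap
        // Measure the gap, stopping as soon as it exceeds maxBad.
        std::size_t bad = 0;
        while (i + bad < buf.size() && bad <= maxBad && !isText(i + bad)) ++bad;
        if (bad > maxBad) break;
        // Require minAfter text bytes directly behind the gap.
        std::size_t after = 0;
        while (i + bad + after < buf.size() && after < minAfter &&
               isText(i + bad + after)) ++after;
        if (after < minAfter) break;
        i += bad + after;  // accept the gap and the text behind it
        textBytes += after;
    }
    return i - pos;
}
```

    With maxBad set to 2, a single Latin-1 byte such as F6 and the two-byte UTF-8 form of ö (C3 B6) would both be bridged, while longer binary stretches still end the run.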
    Last edited by Simon Berger; 13th May 2009 at 18:02.

  15. #15
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Post

    A smallish test:
    Code:
    Archiver		Level	Original Size (bytes)	Compressed Size (bytes)	Compression Time (sec)	Used Memory (bytes)
    paq8q_sse2_intel	7	8.666.064		5.110.032		482,69			940.687.218
    paq8q_sse_intel		7	8.666.064		5.110.015		491,72			940.687.218
    paq8q_speed_optimised	7	8.666.064		5.110.015		517,37			940.687.218
    paq8p3_speed_optimised	7	8.666.064		5.112.573		522,11			932.434.402
    paq8px_sse2_intel	7	8.666.064		5.102.216		757,24			940.596.690
    Test files:
    Code:
    1.440.054 bytes Bliss.bmp
    1.424.935 bytes Borg Corporate Overview_Italiano.pdf
      773.120 bytes Nuova struttura busta paga.ppt
    1.782.840 bytes PPTVIEW.EXE
    2.121.983 bytes SNV10059.JPG
    1.123.132 bytes untitled2.PNG
    6 File(s)      8.666.064 bytes
    Test CPU specs:
    [Attached image: Test_CPU_specs.PNG]

  16. #16
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    405
    Thanks
    155
    Thanked 235 Times in 127 Posts

    text

    Current text detection is:
    If the first byte satisfies (c<128 && c>32) || c==10 || c==13 || c==0x12 || c==9, text detection begins. Only then can UTF-8 detection begin, not the other way around. So if your text file is only UTF-8 then it is not detected.

    The minimum length for detected text is 80 bytes. So if the data is mixed text and binary, and a text part is shorter than 80 bytes, it is detected as binary. This is to minimize the number of detected parts in mixed data. You can set it to 1 in paq8p3 and see what happens. It will give you a log file for it.

    One way is to detect this type of data as BINARYTEXT and apply all models to it.
    KZo
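
    As a simplified illustration of the minimum-length rule (not the actual paq8p3 code), the sketch below splits a buffer into TEXT and BINARY blocks and folds text runs shorter than minLen back into binary. Block, segment and looksLikeText are hypothetical names, and applying the byte test to every byte instead of only the first one is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class BlockType { Binary, Text };

struct Block {
    BlockType type;
    std::size_t start;
    std::size_t length;
};

// Byte test as quoted in the post above.
bool looksLikeText(unsigned char c) {
    return (c < 128 && c > 32) || c == 10 || c == 13 || c == 0x12 || c == 9;
}

// Split `data` into blocks; text runs shorter than minLen count as binary.
std::vector<Block> segment(const std::string& data, std::size_t minLen = 80) {
    assert(minLen > 0);
    std::vector<Block> blocks;
    // Append a block, merging it into the previous one if the type matches.
    auto push = [&blocks](BlockType t, std::size_t start, std::size_t len) {
        if (!blocks.empty() && blocks.back().type == t) {
            blocks.back().length += len;
        } else {
            blocks.push_back(Block{t, start, len});
        }
    };
    std::size_t i = 0;
    while (i < data.size()) {
        const bool text = looksLikeText(static_cast<unsigned char>(data[i]));
        std::size_t j = i;
        while (j < data.size() &&
               looksLikeText(static_cast<unsigned char>(data[j])) == text) ++j;
        const bool keepAsText = text && (j - i) >= minLen;
        push(keepAsText ? BlockType::Text : BlockType::Binary, i, j - i);
        i = j;
    }
    return blocks;
}
```

    Calling segment(data, 1) on mixed data shows how quickly the number of parts grows when the threshold is lowered, which is the effect the 80-byte minimum is meant to limit.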


  17. #17
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Good test, thank you DARcode. A little bit better and faster than p3.
    Could you additionally test "paq8q2_sse2_intel -7 -m6" and "paq8q2_sse2_intel -7 -m4"? -m6 should close the gap to paq8px and -m4 should be a lot faster.

  18. #18
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Simon Berger View Post
    Good test, thank you DARcode. A little bit better and faster than p3.
    Could you additionally test "paq8q2_sse2_intel -7 -m6" and "paq8q2_sse2_intel -7 -m4"? -m6 should close the gap to paq8px and -m4 should be a lot faster.
    Sure, but I can't compile the source from post #12 myself. Should I use the compiles from post #11?

  19. #19
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Yes. That bug shouldn't affect your kind of data.

  20. #20
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    958
    Thanks
    570
    Thanked 396 Times in 294 Posts
    @Simon, Kaitz,
    I counted the number of text parts detected in my k.wad file by the paq8q and paq8q2 versions, and for both versions the number is the same: 2195 parts.
    It was my mistake to say that the lengths start at 50 bytes; the text parts start at 80 bytes.
    Darek

  21. #21
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Quote Originally Posted by Simon Berger View Post
    I am sorry. There was a bug with UTF8 detection.
    Now all text will be printed only as TEXT.
    Fixed version compiled...

    The SSE2 builds are experimental and I don't have the required hardware to test them myself. Please use the standard or speed optimised version if the SSE2 builds don't work properly on your machine.

    EDIT: Attachments "paq8q2.zip" and "paq8q2_sse2.zip" removed.

  22. #22
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Testing both m4 and m6 modes with post #21 SSE2 compiles, encoding terminates correctly but then the executable crashes with the attached error.

    Also, are you guys aware the default archive extension is ".paq8q3"?

    EDIT: OS is WinXP Pro SP2.
    [Attached image: paq8q2_sse2_intel.exe_error.PNG]
    Last edited by DARcode; 13th May 2009 at 19:49.

  23. #23
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Post

    Here you go:
    [Attached image: PAQ8P-X-Q.PNG]

  24. #24
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Thanks again. These are the best results I could expect. I fixed another possible bug in modes 1 and 6: they deleted an array that is never allocated.
    I also corrected the wrong paq8q3 lines. Thanks for the hint.
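
    One general C++ way to avoid this class of bug, shown as a sketch rather than as the actual paq8q fix: let a smart pointer own the optional array, so nothing is released unless something was allocated. OptionalTable is a hypothetical stand-in, and the assumption that modes 1 and 6 skip the allocation is taken only from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a model whose table is only needed in some modes.
class OptionalTable {
public:
    // Assumption for illustration: modes 1 and 6 do not use the table.
    explicit OptionalTable(int mode, std::size_t size = 1024) {
        assert(mode >= 1 && mode <= 6);
        if (mode != 1 && mode != 6) {
            table_ = std::make_unique<unsigned char[]>(size);  // zero-initialised
        }
    }

    bool allocated() const { return table_ != nullptr; }

private:
    // Starts out empty; its destructor frees the array only if one exists.
    std::unique_ptr<unsigned char[]> table_;
};
```

    With raw pointers the equivalent precaution is to initialise the pointer to nullptr, because delete[] on a null pointer does nothing while delete[] on an uninitialised pointer is undefined behaviour.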

  25. #25
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Thanks Simon!

    Compiled...

    EDIT: Attachment "paq8q2_v3.zip" removed.

  26. #26
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    472
    Thanked 175 Times in 85 Posts
    PAQ8PX19 detects about 5.9 MB of wave sounds in my 63 MB Modules Set;
    the overall compression improved from 33.4 MB to 33.2 MB. It is a new world record, but there's still potential.

    The JPEGs of the Mobile Set were all detected; the overall compression improved here from 468 MB to 455 MB and could be improved further if only
    PAQ8 could see the (headerless) JPEGs in the Motion JPEG .mov file.

    Could this be improved with PAQ8Q?

  27. #27
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    It's sometimes a little bit better and sometimes a little bit worse. But there will never be a big improvement, because the models and detections are the same.

  28. #28
    Member DARcode's Avatar
    Join Date
    May 2009
    Location
    Genoa, Italy
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Arrow

    With the post #25 builds and a slightly expanded test: slight improvements in all areas.
    [Attached image: PAQ8P3-PX-PQ.PNG]

  29. #29
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    When decompressing I see HDR popping up in the log; I'm assuming that is HeaDeR? If so, it may be worth changing that to HEAD so people don't confuse it with High Dynamic Range when dealing with compressed images.

  30. #30
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Here's a fixed (see the paq8px thread for more info) paq8q2_v3 build...
