Page 31 of 62
Results 901 to 930 of 1857

Thread: paq8px

  1. #901
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    505
    Thanks
    207
    Thanked 343 Times in 182 Posts
    So s10 is the pxd version with the -s10 option?
    If so, it's not a fair comparison to the px version (less memory).
    One thing is sure: the pxd version has had problems with the samba file ever since I added split-stream compression in v11-v16 or so.
    And for some files the EOL filter helps a lot.
    KZo


  2. #902
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    505
    Thanks
    207
    Thanked 343 Times in 182 Posts
    Quote Originally Posted by mpais View Post
    That is to be expected, Kaido has added TAR support to paq8pxd, among many other formats that paq8px doesn't recognize.
    Tar is not active with the -s option. It actually hurts compression.
    KZo


  3. #903
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    Quote Originally Posted by mpais View Post
    I don't really care what version "wins" Darek, to me the fun is in improving it, be it this version or any other.
    I fully agree, but it's quite exciting to watch the compressors' development that way.

    It's good to know that there is still some low-hanging fruit to pick.

  4. #904
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    Quote Originally Posted by kaitz View Post
    So s10 is the pxd version with the -s10 option?
    If so, it's not a fair comparison to the px version (less memory).
    I know, but I've compared the best option for each compressor.
    I treat higher memory usage as an additional advantage, like an additional model, parser or method, for example the WRT usage by pxd.
    You're right that it's not an apples-to-apples comparison, but I've tried to squeeze the maximum out of each compressor and then compare. That seems fair enough to me, don't you think?

  5. #905
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    Quote Originally Posted by kaitz View Post
    Tar is not active with the -s option. It actually hurts compression.
    I stand corrected, and it's quite interesting that it hurts compression. Guess I'll have to look more closely at your version Kaido.

    I've seen that you have a transform for DEC Alpha (by Matt, if I'm not mistaken) but it's inactive. I'm assuming it's because your
    detection heuristic gives many false positives?

    In EMMA, I have an inactive model for ARMv7 and a detection routine. The model is similar to the one I made for paq8
    (parsing and quantization of the instructions) but it's still unfinished, mainly because I didn't have enough samples
    (I don't like to use few samples, it easily leads to overfitting). The detection heuristic however is actually quite solid,
    my main problem was that you can have Thumb2 instructions in the stream that would wreak havoc with the model.
    Given the predominance of ARM processors nowadays, it might be interesting to give paq8 a model for that.

    Quote Originally Posted by Darek View Post
    It's good to know that there is still some low-hanging fruit to pick.
    Well, I've mainly just been teasing a few small improvements here and there, to see if we can get more people interested
    in researching better context modelling techniques. As you can see, it's not working

  6. Thanks:

    Darek (18th September 2017)

  7. #906
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    505
    Thanks
    207
    Thanked 343 Times in 182 Posts
    Quote Originally Posted by mpais View Post
    I stand corrected, and it's quite interesting that it hurts compression. Guess I'll have to look more closely at your version Kaido.
    I had a regular transform: tar headers at the front, with the rest of the data following. It did not work. Then I tried an hdr-style approach, with the same result. Finally I left it in for -q mode with a 512-byte hdr (you can compile with the extra padding option). It may help in the px version.

    Quote Originally Posted by mpais View Post
    I've seen that you have a transform for DEC Alpha (by Matt, if I'm not mistaken) but it's inactive. I'm assuming it's because your
    detection heuristic gives many false positives?
    Indeed.
    Quote Originally Posted by mpais View Post
    In EMMA, I have an inactive model for ARMv7 and a detection routine. The model is similar to the one I made for paq8
    (parsing and quantization of the instructions) but it's still unfinished, mainly because I didn't have enough samples
    (I don't like to use few samples, it easily leads to overfitting). The detection heuristic however is actually quite solid,
    my main problem was that you can have Thumb2 instructions in the stream that would wreak havoc with the model.
    Given the predominance of ARM processors nowadays, it might be interesting to give paq8 a model for that.
    This might be interesting as a test.
    KZo


  8. #907
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    .

  9. #908
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    paq8px_v113

    I've improved the 24/32bpp image model, both for PNG and non-PNG images.

    Code:
    File: flumy.png, 4.696.877 bytes (SqueezeChart)
    paq8px_v112       1.848.030 bytes
    paq8px_v113       1.803.992 bytes

  10. Thanks (2):

    Darek (10th October 2017),Stephan Busch (11th October 2017)

  11. #909
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    Scores for my testbed for v113. Image files crunched even more!
    Attached thumbnail: paq8px_v113.jpg (428.2 KB)

  12. #910
    Member polemon's Avatar
    Join Date
    Jul 2010
    Location
    Germany
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hmm, the newer versions of paq8px don't compile on Linux. I almost had v112 compiling on Linux; I'll see if I can get the newer ones to compile on Linux, too...

  13. #911
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    paq8px_v114

    I made a few more improvements to the 24/32bpp image model, especially on the PNG variant.
    The 8bpp color-indexed model is also slightly improved.
    The heuristic for the record model now attempts to detect 24bpp images, and the model now uses 2 contexts for 8bpp and 24bpp images.
    This helps on images that aren't detected by the parsers, such as Tif images.

    Code:
    File: flumy.png, 4.696.877 bytes (SqueezeChart)
    paq8px_v113       1.803.992 bytes
    paq8px_v114       1.769.303 bytes

  14. Thanks (3):

    Darek (15th October 2017),Mike (15th October 2017),Stephan Busch (15th October 2017)

  15. #912
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    Here are the scores for my testset. This version brings some improvements on image files, as you wrote.
    However, there is also quite a fine improvement on WAV files (0.WAV and L.PAK) and a nice gain on the K.WAD and Q.WK3 files.
    Attached thumbnail: cmv_020_paq8px_v114.jpg (467.5 KB)

  16. #913
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    270
    Thanks
    112
    Thanked 153 Times in 112 Posts
    I'm trying to compile paq8px_v114.cpp with MinGW-W64 5.2, 6.3 and 7.1, with and without -std=c++11 and -std=c++14, and I always get errors.
    E.g. using 7.1 without -std=... in
    g++ paq8px_v114.cpp -DWINDOWS -lz -Wall -Wextra -O3 -static -static-libgcc -opaq8px_v114.exe (command written in line 112 of paq8px_v114.cpp)
    Code:
    paq8px_v114.cpp: In function 'void im24bitModel(Mixer&, int, int, int)':
    paq8px_v114.cpp:2538:112: error: use of deleted function 'StationaryMap::StationaryMap(StationaryMap&&)'
       static StationaryMap Map[nMaps] = { 12, 12, 12, 12, 10, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 0 };
                                                                                                                    ^
    paq8px_v114.cpp:1738:7: note: 'StationaryMap::StationaryMap(StationaryMap&&)' is implicitly deleted because the default definition would be ill-formed:
     class StationaryMap {
           ^~~~~~~~~~~~~
    paq8px_v114.cpp:1738:7: error: 'Array<T, Align>::Array(const Array<T, Align>&) [with T = unsigned int; int Align = 16]' is private within this context
    paq8px_v114.cpp:757:3: note: declared private here
       Array(const Array&);  // no copy or assignment
       ^~~~~
    paq8px_v114.cpp:1743:3: note:   after user-defined conversion: StationaryMap::StationaryMap(int, int, int)
       StationaryMap(int BitsOfContext, int BitsPerContext = 8, int Rate = 0): Data(1<<(BitsOfContext+BitsPerContext)), Context(0), Mask(1<<BitsPerContext), bCount(1), B(1) {
       ^~~~~~~~~~~~~
    [...]
    Márcio, how did you compile it? Do you use MinGW-W64?

  17. #914
    Member
    Join Date
    Sep 2014
    Location
    Italy
    Posts
    58
    Thanks
    63
    Thanked 28 Times in 18 Posts
    Hi Mauro,

    I compile paq8px using MinGW-W64 with this bat file:
    Code:
    @echo off
    
    set incs=-Izlib -DNDEBUG -DINC_LOG2I -DINC_FLEN -DSTRICT -DWIN32
    
    set opts=-fomit-frame-pointer -fstrict-aliasing -fno-stack-protector -fno-stack-check -fno-check-new -floop-interchange -floop-strip-mine -floop-block -funroll-loops -fpeel-loops -fweb -ffast-math
    :-finline-functions -ftree-vectorize -ftree-loop-vectorize -flto
    
    set arch=-march=native -mtune=native -m64 -ftree-vectorize -fprofile-use
    :-mpreferred-stack-boundary=3 -msse2 -fgcse-after-reload -funsafe-math-optimizations -fassociative-math -freciprocal-math -fbranch-probabilities 
    
    :
                                                                        
    set arg=2
    if not (%1)==() set arg=%1
    
    if (%arg%)==(0) set gcc=C:\mingw\mingw64-w64\bin\g++.exe& set exe=paq8px_x64.exe
    if (%arg%)==(1) set gcc=C:\mingw\mingw64-w64\bin\g++.exe& set exe=paq8px_x64.exe
    if (%arg%)==(2) set gcc=C:\mingw\mingw64-w64\bin\g++.exe& set exe=paq8px_x64.exe
    
    set path=%gcc%\..\
    
    %gcc%\..\gcc.exe -c -O9 %arch% %incs% %opts% -static @zliblist
    
    :-fno-exceptions -Ofast
    %gcc% -O9 -s %arch% %incs% %opts% -fwhole-program -fpermissive -std=gnu++1z -fno-rtti -static paq8px.cpp *.o -o %exe%
    
    del *.o

  18. Thanks:

    Mauro Vezzosi (21st October 2017)

  19. #915
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    paq8px_v115

    I've improved the record model and made a few tweaks to the DMC model.

    Code:
    File: sao, 7.251.944 bytes, Silesia Corpus
    paq8px_v114              3.747.458 bytes
    paq8px_v115              3.741.969 bytes

  20. Thanks:

    Darek (22nd October 2017)

  21. #916
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    270
    Thanks
    112
    Thanked 153 Times in 112 Posts
    @Luca
    Thank you, I can compile paq8px_v114 by adding -std=gnu++1z (AFAIK, it enables the GNU extensions and the experimental support for C++1z/17; will other compilers have problems?)
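
    A possible explanation for why the flag helps, as far as I understand it: before C++17, initializing an array element from a plain int first builds a temporary and then formally needs an accessible copy or move constructor, which StationaryMap does not have. C++17 removes that requirement. Bracing each element initializes it in place and should also work with older standards. A minimal sketch of that workaround, where TinyMap and exampleMaps are hypothetical stand-ins and not paq8px code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a model component that owns a buffer and
// must not be copied (the real StationaryMap is more involved).
class TinyMap {
public:
  // Non-explicit converting constructor, similar in shape to the one
  // named in the error message.
  TinyMap(int bitsOfContext, int bitsPerContext = 8)
      : data_(std::size_t(1) << (bitsOfContext + bitsPerContext)) {
    assert(bitsOfContext >= 0 && bitsPerContext >= 0);
  }
  TinyMap(const TinyMap&) = delete;             // no copy
  TinyMap& operator=(const TinyMap&) = delete;  // no assignment
  std::size_t size() const { return data_.size(); }

private:
  std::vector<unsigned> data_;
};

// `static TinyMap maps[3] = { 4, 2, 0 };` needs a copy/move constructor
// before C++17. With one brace pair per element, each object is
// constructed directly from its argument list instead.
inline TinyMap* exampleMaps() {
  static TinyMap maps[3] = { {4}, {2}, {0} };
  return maps;
}
```

    Calling exampleMaps()[0].size() gives the table size of the first element, so the array is usable like the original one.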
    @Márcio
    I'm starting to improve the DMC model; there are "many" things to try changing.
    Will you continue to tweak DMC, or can I work on it?
    I guess there are some missing "break" statements at the end of these "case" labels.
    Code:
    paq8px_v114.cpp: In function 'void ProcessMode(Instruction&, ExeState&)':
    paq8px_v114.cpp:4608:25: warning: this statement may fall through [-Wimplicit-fallthrough=]
           case fDR : Op.Data|=(2<<TypeShift);
                      ~~~~~~~^~~~~~~~~~~~~~~~
    paq8px_v114.cpp:4609:7: note: here
           case fDA : Op.Data|=(1<<TypeShift);
           ^~~~
    paq8px_v114.cpp:4609:25: warning: this statement may fall through [-Wimplicit-fallthrough=]
           case fDA : Op.Data|=(1<<TypeShift);
                      ~~~~~~~^~~~~~~~~~~~~~~~
    paq8px_v114.cpp:4610:7: note: here
           case fAD : {
           ^~~~
    paq8px_v114.cpp:4640:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
             Op.BytesRead = 0;
             ~~~~~~~~~~~~~^~~
    paq8px_v114.cpp:4642:7: note: here
           default: State = Start; /*no immediate*/
           ^~~~~~~

  22. Thanks:

    mpais (22nd October 2017)

  23. #917
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    paq8px_v116

    The first cases are supposed to be like that, but not the last line - "Op.BytesRead = 0;"
    I knew something was off with the new model: it should beat EMMA almost all the time. But since I don't have a debugger (I'm just using Notepad++ and compiling on the command line) and always skip the warnings, I didn't think it was something so simple. Thank you Mauro, the difference is huge:

    Code:
    File: acrord32.exe, 3.870.784 bytes
    paq8px_v115          865.592 bytes
    paq8px_v116          857.345 bytes
    This is what I get for forgetting that in C/C++ I need to manually break from a switch statement (Delphi is different).
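
    To illustrate the pattern: a small sketch, not the actual ProcessMode() code. The first cases fall through on purpose so that their flags accumulate, while the last case needs an explicit break or the default branch runs as well. The enum values and the decode() function are made up for the example, and the withBreak parameter only exists to show both behaviours side by side.

```cpp
enum Format { fDR, fDA, fAD, fOther };  // hypothetical labels

// Intentional fallthrough: the first two cases add a flag and continue
// into the next case. The last case must break explicitly, otherwise
// `default` also runs and overwrites the result.
inline int decode(Format f, bool withBreak) {
  int data = 0;
  switch (f) {
    case fDR: data |= 2;
      // fall through
    case fDA: data |= 1;
      // fall through
    case fAD:
      data |= 8;
      if (withBreak) break;  // models the break that was missing
    default:
      data = -1;             // "no immediate" path resets the result
  }
  return data;
}
```

    With withBreak set to false, decode(fAD, false) ends up in the default branch, which is the kind of silent error the compiler warning above points at.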

    Feel free to tweak the DMC model Mauro, I'll have to port these changes to cmix and I'm also working on ZCM, so I probably won't release a new version of paq8px until the next weekend.

  24. Thanks (3):

    Darek (22nd October 2017),Mike (22nd October 2017),Stephan Busch (22nd October 2017)

  25. #918
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    My testset compressed by v115. Nice gain for most non-model files, up to 0.7%.
    Attached thumbnail: paq8px_v114.jpg (473.8 KB)

  26. Thanks:

    mpais (22nd October 2017)

  27. #919
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,100
    Thanks
    677
    Thanked 431 Times in 329 Posts
    And my testbed results for v116. Improvement on the I.EXE file. Other files show only slight differences.
    Attached thumbnail: paq8px_v116.jpg (473.0 KB)
    Last edited by Darek; 22nd October 2017 at 11:54.

  28. #920
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    236
    Thanked 90 Times in 70 Posts
    Hello thread! I've been toying around with paq8px for a few days and I hope you can shed some light on its inner workings. In particular, I was amazed by how much time it spends on some tasks that aren't very compute-intensive. See this for example:

    Code:
    paq8px_v116.exe -0 zpaqgui-win32.win32.x86.zip
    
    ...
    
    Compressed from 22789601 to 57496695 bytes.
    
    Total 22789601 bytes compressed to 57496746 bytes.
    Time 1863.51 sec, used 10845093 bytes of memory
    In comparison to:

    Code:
    precomp -cn -intense -brute 
    
    ...
    
    100.00% - New size: 57453657 instead of 22789601     
    
    
    Done.
    Time: 2 minute(s), 28 second(s)
    As you can see, paq8px is about 12-13 times slower than precomp for the very same task. Why could this be? And more importantly, is it possible to improve this timing? I think that in doing so, paq8px could be so much faster, and maybe use some of the saved CPU time on the main compression engine. TIA - Gonzalo M.

  29. #921
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    I can't reproduce it. I downloaded zpaqgui-win32.win32.x86.zip from https://github.com/thometal/zpaqgui/releases, the filesize is different than yours but I don't think that would explain the difference in our results:

    Code:
    paq8px_v116.exe -0 zpaqgui-win32.win32.x86.zip
    ...
    Total 22788604 bytes compressed to 57494485 bytes.
    Time 179.20 sec, used 10845128 bytes of memory
    
    
    precomp -cn -intense -brute zpaqgui-win32.win32.x86.zip
    ...
    100.00% - New size: 57451397 instead of 22788604
    
    Done.
    Time: 2 minute(s), 58 second(s)

  30. #922
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    236
    Thanked 90 Times in 70 Posts
    Hmmm... that's strange... CPU usage is normal: somewhat above one core of the four.
    And it is not exclusive to this file. I see the same speed difference on every file I tried. Maybe I lack some CPU features used by paq8px? I have an Intel Atom. Here are the specs:
    Attached Files

  31. #923
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    270
    Thanks
    112
    Thanked 153 Times in 112 Posts
    I enabled the assert() checks and these are the first three assertions failed:
    Code:
    1) Pre-training x86/x64 model...Assertion failed!
       File: paq8px_v116.cpp, Line 897
       Expression: i>0
    
       The error is in the line 4976 when i == 1:
       if (i) mask=mask*2+(buf(i-1)==0), count0+=mask&1;
    
    2) I bypassed the previous assertion failed with "if (i>1)" (I don't know if it's right or not).
       [...]
       File list (20 bytes)
       Assertion failed!
       File: paq8px_v116.cpp, Line 1957
       Expression: (((unsigned long long)cp[i])&63)>=15
    
       I changed the cast from long() to (unsigned long long)() to compile without errors.
       I didn't search who calls ContextMap::mix1().
    
    3) I bypassed the previous assertion failed commenting out the assert(), I do not know if this error may be due to the previous one.
       [...]
       1/1  Filename: english.dic (465211 bytes)
       Block segmentation:
        0           | default          |    465211 bytes [0 - 465210]
       Compressing... Assertion failed!
       File: paq8px_v116.cpp, Line 1226
       Expression: p>=0 && p<4096
    
       I didn't search who calls stretch().
    In my opinion, we should enable assert() (by commenting out line 606, "#define NDEBUG") and compiler warnings (e.g. -Wall -Wextra in MinGW-w64) at least once before posting a program; it's quicker to fix an error right away than to fix it weeks or months (or years?) later.

    Quote Originally Posted by Mauro Vezzosi View Post
    I'm starting to improve the DMC model, there are "many" things to try to change.
    DMC is harder to improve than I expected, and the gain I got is ridiculously small and not consistent.
    @Márcio
    I'm still working on it; however, feel free to work on paq8px.

  32. Thanks:

    mpais (28th October 2017)

  33. #924
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Hungary
    Posts
    399
    Thanks
    278
    Thanked 283 Times in 149 Posts
    Quote Originally Posted by Gonzalo View Post
    Hmmm... that's strange... CPU usage is normal: somewhat above one core of the four.
    And it is not exclusive to this file. I see the same speed difference on every file I tried. Maybe I lack some CPU features used by paq8px? I have an Intel Atom. Here are the specs:
    Execution times for paq8px are consistent with the CPU (single thread) speeds.
    But precomp execution times are not.
    See the attached table:

    [Image: exec_times.png]
    (CPU single thread ratings are from cpubenchmark.net)

    @Márcio, did you run your benchmark with precomp 0.4.6? Your 178-second runtime is waaay too slow. It should be around 36 seconds. Your CPU thread speed is ~7x faster than Gonzalo's CPU thread speed, and executing paq took ~1/10th of the time. It is very unlikely that executing precomp on this CPU takes approximately the same amount of time as on an Atom.
    Attached thumbnail: exec_times.png (13.9 KB)
    Last edited by Gotty; 28th October 2017 at 02:03.

  34. Thanks:

    Gonzalo (28th October 2017)

  35. #925
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    236
    Thanked 90 Times in 70 Posts
    There are a few enhancements of my own to the precomp run that might affect the measures:

    1) Compiling options: -O2 and -O3 replaced with -Ofast, added -march=native, -mtune=native
    2) Process run from ram drive to avoid disk thrashing.

    But there isn't much difference though. New timing with a non optimized compile run from a normal folder:

    Code:
    100.00% - New size: 57453657 instead of 22789601     
    
    Done.
    Time: 2 minute(s), 48 second(s)
    Maybe Márcio is using the old precomp without the brute mode optimisations...

  36. #926
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    611
    Thanks
    246
    Thanked 240 Times in 119 Posts
    Using 0.4.5 would explain the inconsistency on mpais' side, but I doubt it, as it would have taken hours (can't check myself at the moment).

    Regardless of the CPU inconsistencies, when comparing paq8px -0 to Precomp -cn, don't use brute. It looks for raw streams without the 2 byte zlib header, which paq8px -0 doesn't. So using only -cn -intense is closer to what paq8px -0 does.
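
    For reference, the 2-byte zlib header is cheap to test for, which is why it makes a good anchor when searching for streams. The sketch below follows the header layout from RFC 1950 and is only an illustration; it is not the detection code of Precomp or paq8px, and the function name looksLikeZlibHeader is made up.

```cpp
#include <cstdint>

// Returns true if the two bytes could be a zlib (RFC 1950) header.
// Raw deflate streams, such as those inside ZIP files, have no such
// header, so a scanner cannot rely on this check for them.
inline bool looksLikeZlibHeader(std::uint8_t cmf, std::uint8_t flg) {
  const bool deflate = (cmf & 0x0F) == 8;             // CM = 8: deflate
  const bool windowOk = (cmf >> 4) <= 7;              // CINFO: window <= 32 KB
  const bool checkOk = ((cmf << 8) | flg) % 31 == 0;  // FCHECK makes it a multiple of 31
  return deflate && windowOk && checkOk;
}
```

    The common byte pair 0x78 0x9C passes the check, while the "PK" signature of a ZIP local header (0x50 0x4B) does not.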
    http://schnaader.info
    Damn kids. They're all alike.

  37. #927
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    611
    Thanks
    246
    Thanked 240 Times in 119 Posts
    Made a quick test here (DELL Latitude E6510) and there's a factor 6 between Precomp and paq8px - 1 min 8 s for Precomp 0.4.6 (both with and without brute, makes no difference for this file), 5 min 58 s for paq8px.

    I first saw the deep recursion (depth 2) and suspected console output or temporary files, but then I noticed CPU usage was at 100% (for one core) most of the time. This most likely means zlib parameter trial is the culprit (which of the 81 parameters to use). I'll have a closer look at it later to be sure. Precomp remembers the combinations that were used most often and tries these first. For paq8px, I'm not sure, but I think there's some heuristic similar to the one AntiZ uses.

    Also note that the most used stream type in the file is ZIP (>10000 streams), which works a bit differently in paq8px than the other stream types.
    http://schnaader.info
    Damn kids. They're all alike.

  38. #928
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    @Mauro Vezzosi
    Thank you, it's fixed, I'll release the fixes in the next version.

    The 1st assertion failure was in the zero-mapping part of the new exe model, it didn't cause any major problems so I didn't notice it.
    buf(n) returns the n-th byte back in the rotating input buffer, so I checked if i>0 but forgot I required i-1.
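
    A small sketch of that off-by-one, assuming a power-of-two rotating buffer where offset 1 is the most recent byte. RingBuf, back() and countZerosBefore() are hypothetical names for illustration; the real fix in paq8px is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rotating buffer; back(1) is the most recently written byte.
class RingBuf {
public:
  explicit RingBuf(std::size_t sizePow2) : b_(sizePow2), pos_(0) {
    assert(sizePow2 > 0 && (sizePow2 & (sizePow2 - 1)) == 0);
  }
  void push(std::uint8_t c) { b_[pos_++ & (b_.size() - 1)] = c; }
  std::uint8_t back(std::size_t i) const {
    assert(i > 0);  // back(0) would address the byte not yet written
    return b_[(pos_ - i) & (b_.size() - 1)];
  }

private:
  std::vector<std::uint8_t> b_;
  std::size_t pos_;
};

// Counts zero bytes at offsets i-1 for i = 1..n. The guard has to be
// i > 1 rather than i > 0, because the access uses i-1 and offset 0
// is not valid.
inline int countZerosBefore(const RingBuf& buf, std::size_t n) {
  int zeros = 0;
  for (std::size_t i = 1; i <= n; ++i)
    if (i > 1) zeros += buf.back(i - 1) == 0;
  return zeros;
}
```

    With only `if (i)` as the guard, the first iteration would call back(0) and trip the assertion, which matches the failure reported above.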

    The 2nd assertion is obviously wrong on 64-bit compiles (it's casting a pointer to 32 bits), but the reason it's failing is that
    ContextMap expects its hash table to be aligned to 64-byte cache-line boundaries. The improvements to the memory allocation
    made by Shelwien treat allocation requests over 1 MB separately, and blocks up to 1 MB are manually aligned, by default,
    to 16 bytes. I had changed it to allow for other alignments, but didn't change the ContextMap's variable "Array<E> t" to "Array<E,64> t".
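
    Manual alignment of this kind usually means over-allocating and rounding the pointer up. The following is a generic sketch under that assumption; AlignedBlock and isAligned are hypothetical names, and this is not the Array<T,Align> implementation from paq8px.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: over-allocate, then round the pointer up to Align.
template <std::size_t Align>
struct AlignedBlock {
  static_assert(Align > 0 && (Align & (Align - 1)) == 0, "power of two");
  void* raw;      // pointer returned by malloc, needed for free()
  char* aligned;  // first address inside the block that is Align-aligned

  explicit AlignedBlock(std::size_t bytes)
      : raw(std::malloc(bytes + Align - 1)), aligned(nullptr) {
    if (raw) {
      std::uintptr_t p = reinterpret_cast<std::uintptr_t>(raw);
      p = (p + Align - 1) & ~static_cast<std::uintptr_t>(Align - 1);
      aligned = reinterpret_cast<char*>(p);
    }
  }
  ~AlignedBlock() { std::free(raw); }
  AlignedBlock(const AlignedBlock&) = delete;
  AlignedBlock& operator=(const AlignedBlock&) = delete;
};

// True if p sits on a multiple of `align` bytes.
inline bool isAligned(const void* p, std::size_t align) {
  return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}
```

    A table obtained through AlignedBlock<16> is only guaranteed to start on a 16-byte boundary, so code that assumes 64-byte cache lines needs AlignedBlock<64>, in the same way the ContextMap needs "Array<E,64> t".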

    The 3rd assertion failed because my StationaryMap rounded the prediction, so the effective possible prediction interval was [0..4096],
    and I didn't check if "stretch()" had checks for that (I do in EMMA). Since rounding has little to no effect on compression, I simply
    removed it.
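
    For comparison, the other option (a range check in front of the lookup) could look roughly like this. It is a sketch that assumes a table-driven stretch(p) = ln(p/(1-p)) over 12-bit probabilities; the scaling factor, the table layout and the names stretchTable and stretchClamped are my assumptions, not the paq8px code.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Sketch of a table-driven stretch(p) = ln(p / (1 - p)) for 12-bit
// probabilities p in [0, 4095]; scaling and layout are assumptions.
inline const std::array<short, 4096>& stretchTable() {
  static const std::array<short, 4096> table = [] {
    std::array<short, 4096> t{};
    for (int p = 0; p < 4096; ++p) {
      const double x = (p + 0.5) / 4096.0;  // stay away from ln(0)
      const double s = std::log(x / (1.0 - x)) * 256.0;
      t[p] = static_cast<short>(std::max(-2047.0, std::min(2047.0, s)));
    }
    return t;
  }();
  return table;
}

// Clamp first: a rounded prediction can reach 4096, one past the table.
inline int stretchClamped(int p) {
  return stretchTable()[std::min(std::max(p, 0), 4095)];
}
```

    With the clamp, an input of 4096 is treated like 4095 instead of reading past the end of the table.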

    The DMC model usually doesn't contribute much to compression. The reason I made models optional in EMMA was precisely to allow
    for testing the contribution of each individual model separately. You seem to have good sparse contexts in CMV, so maybe you'll have
    better luck trying to improve the sparseModel instead.

    @Gonzalo, Gotty, schnaader
    I'm using precomp.exe from the latest 0.4.6 release from GitHub. I re-ran the test and actually got a worse result (3m 06s).

    Also, note that I'm not running it on the i5 2400, but on my i7 5820k at 4.4 GHz, which better accounts for the speedup from paq8px.

    So it seems my file being slightly different from yours is probably the reason for it, since I didn't know you were running on an Atom processor.
    Have you tried using the file I used?

  39. Thanks:

    Mauro Vezzosi (28th October 2017)

  40. #929
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    611
    Thanks
    246
    Thanked 240 Times in 119 Posts
    Quote Originally Posted by mpais View Post
    So it seems my file being slightly different from yours is probably the reason for it, since I didn't know you were running on an Atom processor.
    Have you tried using the file I used?
    I used the zpaqgui file from the GitHub releases, too, so it has to be something else. But we're very likely comparing apples to oranges here when CPU and HD/SSD performance differs that much. Also, I made the tests under Windows, I guess you two used Linux?
    http://schnaader.info
    Damn kids. They're all alike.

  41. #930
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    611
    Thanks
    246
    Thanked 240 Times in 119 Posts
    Looking closer at the file and paq8px, it's definitely the zlib parameter trial. Note that it still doesn't explain the result from mpais; perhaps the HD is too slow there and masks the CPU time.

    There are almost 13000 ZIP streams in the file, so the recompression (calls to deflate) dominates CPU time. In paq8px, all 81 combinations are tested in a big loop, and those with too many differences are skipped early with a "continue" statement, see line 6871:

    Code:
         for (int j=0; j<81; j++) {
            if (diffCount[j]>=LIMIT) continue;
            rec_strm[j].next_in=&zout[0];  rec_strm[j].avail_in=BLOCK-main_strm.avail_out;
            rec_strm[j].next_out=&zrec[recpos[j]]; rec_strm[j].avail_out=BLOCK*2-recpos[j];
            int ret=deflate(&rec_strm[j], (int)main_strm.total_in == len ? Z_FINISH : Z_NO_FLUSH);
    The constant LIMIT is set to 128, so all the combinations that differ by more than 128 bytes won't be tried any longer. Usually, this leaves only the "correct" combination and only some CPU time is wasted. Using Precomp in verbose mode, we can have a look at which combinations work for the file. E.g. the very first ZIP stream is:

    Code:
    (0.01%) Possible zLib-Stream in ZIP found at position 1348
    [...]
    Best match with level combination 65, windowbits = 15: 2030 bytes, decompressed to 16597 bytes
    Precomp also tells us all used combinations for the file at the end (-zl suggestion): 58,65,68,73,84,85,93 - we can count them in the -v output to see how often each was used:

    Code:
          65: 10828 times
          68: 2014 times
          93: 6 times
          84,85: 3 times
          58, 73: once
    What happens with our first stream if we enforce the second one using "-zl68"?

    Code:
    (0.01%) Possible zLib-Stream in ZIP found at position 1348
    [...]
    Best match with level combination 68, windowbits = 15: 2030 bytes, decompressed to 16597 bytes
    Well, apparently nothing. There's some vagueness in the zlib parameters, so some of them give the same result. Which explains the paq8px result - it just deflates most of the streams multiple times. In Precomp, I sort combinations by how often they were used and try them one after the other, so combination 65 which occurs 10828 times quickly dominates. Code is at line 3446:

    Code:
              for (windowbits = -15; windowbits < -7; windowbits++) {
                for (int index = 0; index < 81; index++) {
                  if (levels_sorted[index] == -1) break;
                  int comp_level = (levels_sorted[index] % 9) + 1;
                  int mem_level = (levels_sorted[index] / 9) + 1;
    
                  try_recompress(fin, comp_level, mem_level, windowbits, compressed_stream_size, retval, in_memory);
    
                  if (final_compression_found) break;
                }
                if (final_compression_found) break;
              }
    In fact, Precomp is lucky that the windowbits parameter (that usually is given in the zlib header, but not present for ZIP streams) is -15 for all streams and it doesn't have to test all the others. paq8px only uses -15 for ZIP streams, btw.

    Also note that Precomp's parameter sorting still has drawbacks. E.g. if we concatenated the test file with another one where some other combination is used, Precomp would still try combination 65 first at least 10000 times until the other combination dominates. Speaking in PAQ terms, it's a zlib combination predictor strategy that works well for homogeneous files or for files where none or only some of the combinations dominate.
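
    The ordering idea can be sketched in a few lines. This is a simplified illustration and not Precomp's actual code: ComboOrder and its members are hypothetical, the `matches` callback stands in for a real recompression attempt, and only the index-to-level mapping is taken from the loop quoted above.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <numeric>

// Hypothetical tracker: remembers how often each of the 81 combinations
// matched and proposes the most frequently used ones first.
class ComboOrder {
public:
  ComboOrder() { counts_.fill(0); }

  // Tries combinations in descending order of past use. Returns the
  // index of the first combination that matches, or -1 if none does.
  int findMatch(const std::function<bool(int compLevel, int memLevel)>& matches) {
    std::array<int, 81> order;
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [this](int a, int b) { return counts_[a] > counts_[b]; });
    trials_ = 0;
    for (int index : order) {
      ++trials_;
      if (matches(index % 9 + 1, index / 9 + 1)) {  // mapping as in the loop above
        ++counts_[index];
        return index;
      }
    }
    return -1;
  }

  // Number of combinations tried during the last findMatch() call.
  int lastTrials() const { return trials_; }

private:
  std::array<unsigned, 81> counts_;
  int trials_ = 0;
};
```

    On a homogeneous file, the first stream pays for the full search and later streams hit on the first trial. After a switch to a different combination, the old favourite keeps being tried first until its count is overtaken, which is the drawback described above.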
    http://schnaader.info
    Damn kids. They're all alike.

  42. Thanks:

    mpais (29th October 2017)
