Scores of 4 Corpuses for paq8pxd_v93. The best overall scores for all 4 tests and the majority of files!
LucaBiondi (1st January 2021)
@kaitz - Happy New Year, and I wish the next one will be better!
kaitz (1st January 2021),LucaBiondi (1st January 2021),schnaader (8th January 2021),xinix (1st January 2021)
paq8pxd_v94
Code:
-change jpeg model
-fix/change detection (tiff,text)
Code:
paq8pxd_v93 -s8 mill.jpg 7132151 4933064 215.93 sec
paq8pxd_v94 -s8 mill.jpg 7132151 4928860 222.67 sec
KZo
Darek (8th January 2021),Mike (8th January 2021),moisesmcardona (8th January 2021),xinix (7th January 2021)
Scores of my testset for paq8pxd v94. For the F.JPG file there is a gain of 215 bytes. Other files remain the same.
kaitz (13th January 2021)
paq8pxd_v95
Code:
jpeg model:
-more context in Map1 (20)
-more inputs from main context
-2 main mixer inputs + 1 apm
-cleanup
So mill.jpg is 18571 bytes better v95 vs v94.
Code:
                                                 Size  Compressed  Sec
paq8pxd_v95 -s8 a10.jpg                        842468      618555    43.42 sec 1984 MB
paq8px_v200 -8  a10.jpg                        842468      624597    26.51 sec 2602 MB
paq8pxd_v95 -s8 mill.jpg                      7132151     4910289   350.38 sec 1984 MB
paq8px_v200 -8  mill.jpg                      7132151     4952115   228.65 sec 2602 MB
paq8pxd_v95 -s8 paq8px_v193_4_Corpuses.jpg    3340610     1367528   167.13 sec 1984 MB
paq8px_v200 -8  paq8px_v193_4_Corpuses.jpg    3340610     1513850   105.90 sec 2602 MB
paq8pxd_v95 -s8 DSCN0791.AVI                 30018828    19858827  1336.94 sec 1984 MB
paq8px_v200 -8  DSCN0791.AVI                 30018828    20171981   992.85 sec 2602 MB
It's slower, but I'm sure nobody cares. Some main context changes have zero time penalty but improve the result by a few KB.
For a10.jpg the new Map1 context adds only about 5 sec.
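For anyone who wants to re-check the numbers, here is a minimal sketch that recomputes the byte gains from the sizes quoted in the table above. The names `results`, `gain` and `ratio_percent` are only illustrative helpers, not anything from paq8pxd itself.

```python
# Sizes copied from the table above:
# file -> (original, paq8pxd_v95 -s8, paq8px_v200 -8)
results = {
    "a10.jpg": (842468, 618555, 624597),
    "mill.jpg": (7132151, 4910289, 4952115),
    "paq8px_v193_4_Corpuses.jpg": (3340610, 1367528, 1513850),
    "DSCN0791.AVI": (30018828, 19858827, 20171981),
}


def gain(pxd_size, px_size):
    """Bytes saved by paq8pxd_v95 relative to paq8px_v200 (positive = smaller)."""
    return px_size - pxd_size


def ratio_percent(original, compressed):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed / original


for name, (orig, pxd, px) in results.items():
    print(f"{name:28s} gain {gain(pxd, px):7d} bytes  "
          f"pxd {ratio_percent(orig, pxd):6.2f}%  px {ratio_percent(orig, px):6.2f}%")

# v94 -> v95 on mill.jpg, using the v94 size reported earlier in the thread.
mill_v94 = 4928860
mill_gain = mill_v94 - results["mill.jpg"][1]
print("mill.jpg v94 -> v95:", mill_gain, "bytes")
```

The last line gives the 18571 bytes mentioned above; the per-file gains are plain subtractions of the two compressed sizes.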
KZo
Darek (14th January 2021),Mike (13th January 2021),moisesmcardona (14th January 2021)
Where can I get the mill.jpg and dscn0791.avi files? Could you upload them here, please? Thank you.
No
KZo
Scores of my testset and 4 Corpuses for paq8pxd v94 and paq8pxd v95.
Good improvements on JPG files and on files containing such structures.
A10.JPG file got 618'527 bytes!
kaitz (17th January 2021)
Scores of paq8pxd v95 and previous versions for enwik8 and enwik9:
15'654'151 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v89_60_4095, change: 0,00%, time 10422,07s
122'945'119 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v89_60_4095, change: -0,06%, time 100755,31s - best score for paq8pxd versions
15'647'580 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v90, change: -0,04%, time 9670,5s
123'196'527 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v90, change: 0,20%, time 110200,16s
15'642'246 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 10130s - best score for paq8pxd versions (the same as paq8pxd v93 and paq8pxd v94)
123'151'008 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 102009,55s
kaitz (18th January 2021)
@kaitz, I opened another MR https://github.com/kaitz/paq8pxd/pull/15.
Since you added a BZip2 transform, the BZip2 library needed to be added to the CMakeLists so that CMake can detect it and your latest versions can compile.
PAQCompress: http://moisescardona.me/paqcompress
paq8pxd_v99
Code:
-xml/html like content processed separately in wordmodel
-adjust some wordmodel parameters
-attempt to detect ISO latin text
-some changes in detection (cleanup)
1 thing helps on silesia webster/xml
2 helps 1bit on most text files (1k loss on enwik8)
3 helps on silesia samba, maybe mozilla and any other text file previously detected as bintext
4 in tar mode some files are treated as text by extension without confirming (.c,.h,.html,.cpp,.po,.txt), just a bit faster processing (for example, linux kernel source tar)
KZo
Darek (24th January 2021),Mike (21st January 2021),moisesmcardona (21st January 2021),xinix (22nd January 2021)
Darek (22nd January 2021)
paq8pxd_v100
Fixes https://encode.su/threads/1464-Paq8p...ll=1#post66290
Code:
add lstm model back (active on -x option)
used on all predictors except audio
add matchModel from paq8px_v201 as second model
adjust old matchModel parameters
tar header as bintext
add back 1 mixer context if DECA
in sparsemodel (default) add 2 contexts
adjust normalModel
KZo
Darek (24th January 2021),Mauro Vezzosi (25th January 2021),Mike (24th January 2021),moisesmcardona (24th January 2021),xinix (24th January 2021)
Scores for paq8pxd v99 and paq8pxd v100 for my testset.
The paq8pxd v99 version is about 15KB better than paq8pxd v95.
The paq8pxd v100 version is about 35KB better than paq8pxd v99, mainly due to the lstm implementation; however, the gain is not as big as in the paq8px version (about 95KB between the non-lstm and lstm versions).
Timings for paq8pxd v99, paq8pxd v100 and paq8px v200 (-l) versions:
paq8pxd v99 = 5'460,32s
paq8pxd v100 = 11'610,80s = 2.1 times slower - it's still about 1.7 times faster than paq8px
paq8px v200 = 19'440,11s
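The ratios quoted above can be re-derived with a few lines. This is only a sketch: `parse_number` and `times_slower` are hypothetical helper names, and the parser simply assumes the apostrophe-as-thousands, comma-as-decimal notation used for the timings in this post.

```python
def parse_number(text):
    """Parse figures written like 5'460,32s or 15'642'246 into a float."""
    cleaned = text.strip().rstrip("s").replace("'", "").replace(",", ".")
    return float(cleaned)


# Timings exactly as written above.
timings = {
    "paq8pxd v99": parse_number("5'460,32s"),
    "paq8pxd v100": parse_number("11'610,80s"),
    "paq8px v200": parse_number("19'440,11s"),
}


def times_slower(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return timings[slow] / timings[fast]


v100_vs_v99 = times_slower("paq8pxd v100", "paq8pxd v99")
px_vs_v100 = times_slower("paq8px v200", "paq8pxd v100")
print(f"paq8pxd v100 is {v100_vs_v99:.1f}x slower than paq8pxd v99")
print(f"paq8px v200 is {px_vs_v100:.1f}x slower than paq8pxd v100")
```

Rounded to one decimal, the two ratios match the 2.1 and 1.7 figures given with the timings.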
kaitz (25th January 2021)
Using paq8sk44 with the -s8 option on f.jpg (DBA corpus), the result is:
Total 112038 bytes compressed to 80194 bytes.
Time 19.17 sec, used 2444 MB (2563212985 bytes) of memory
Darek (26th January 2021)
silesia
v100 breaks osdb.
Code:
paq8pxd         v99       v100
                -s8        -s8      diff
dickens     1895705    1895269       436
mozilla     6917463    6910405      7058
mr          1999233    1998160      1073
nci          807857     801198      6659
ooffice     1305484    1301817      3667
osdb        2025419    2059676    -34257
reymont      759011     758606       405
samba       1680535    1676684      3851
sao         3734168    3733871       297
webster     4637776    4635525      2251
x-ray       3575990    3577183     -1193
xml          247545     246671       874
Total      29586186   29595065     -8879
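A quick way to double-check the diff column and spot the regressions is sketched below. The sizes are the v99 and v100 -s8 figures from the Silesia table above; the variable names are just for illustration.

```python
# file, v99 -s8 size, v100 -s8 size (copied from the table above)
table = """
dickens 1895705 1895269
mozilla 6917463 6910405
mr 1999233 1998160
nci 807857 801198
ooffice 1305484 1301817
osdb 2025419 2059676
reymont 759011 758606
samba 1680535 1676684
sao 3734168 3733871
webster 4637776 4635525
x-ray 3575990 3577183
xml 247545 246671
"""

rows = []
for line in table.strip().splitlines():
    name, v99, v100 = line.split()
    rows.append((name, int(v99), int(v100)))

# Positive diff = v100 is smaller than v99 (a gain).
diffs = {name: v99 - v100 for name, v99, v100 in rows}
total_v99 = sum(v99 for _, v99, _ in rows)
total_v100 = sum(v100 for _, _, v100 in rows)
regressions = sorted(name for name, d in diffs.items() if d < 0)
gain_without_osdb = sum(d for name, d in diffs.items() if name != "osdb")

print("total diff:", total_v99 - total_v100)
print("regressions:", regressions)
print("diff excluding osdb:", gain_without_osdb)
```

It confirms the totals in the table and shows that, apart from osdb (and a small loss on x-ray), v100 is ahead of v99 on every file.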
KZo
Damn bzip2, I have it in [...]\MinGW_7.1\x86_64-7.1.0-release-win32-sjlj-rt_v5-rev1\mingw64\opt and I had to add it by hand:
g++ paq8pxd.cpp -DWINDOWS -msse2 -O3 -s -static -lz -I"[...]\MinGW_7.1\x86_64-7.1.0-release-win32-sjlj-rt_v5-rev1\mingw64\opt\include" -L"[...]\MinGW_7.1\x86_64-7.1.0-release-win32-sjlj-rt_v5-rev1\mingw64\opt\lib" -lbz2.dll -o paq8pxd.exe
and also copy "[...]\MinGW_7.1\x86_64-7.1.0-release-win32-sjlj-rt_v5-rev1\mingw64\opt\bin\libbz2-1.dll" to the current directory because paq8pxd requires it.
Were the fixes here?
99
13013 int inputs() {return 0;}
13014 int nets() {return 0;}
13015 int netcount() {return 0;}
100
13745 int inputs() {return 2+1+1;}
13746 int nets() {return (horizon<<3)+7+1+8*256;}
13747 int netcount() {return 1+1;}
Scores for paq8pxd v99 and paq8pxd v100 on 4 Corpuses.
Very nice changes and good improvements compared to the previous version (v95):
Calgary: paq8pxd v99 = 484 bytes gain over the previous version, paq8pxd v100 = 817 bytes gain over v99
Canterbury: paq8pxd v99 = no gain over the previous version, paq8pxd v100 = 258 bytes gain over v99
Maximum Compression: paq8pxd v99 = 3.4KB gain over the previous version, paq8pxd v100 = 40.9KB gain over v99 - good lstm improvement!
Silesia: paq8pxd v99 = 150KB gain over the previous version (!!!), paq8pxd v100 = about 170KB gain over v99 - paq8pxd is close to breaking the 29'000'000 bytes score! And with the -xN option there is no such bad impact on the "osdb" file.
kaitz (2nd February 2021)
enwik8 score for paq8pxd_v100 compared to paq8pxd_v95 (enwik9 is estimated for now):
15'642'246 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 10'130,00s
123'151'008 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 102'009,55s
15'582'810 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v100, change: -0,37%, time 22'066,98s
122'683'070 estimated - enwik9 -x15 -w -e1,english.dic by paq8pxd_v100, change: -0,37%, time - about 220'000s
Maybe as usual?
Just use MinGW and the supplied Makefile.
PAQCompress: http://moisescardona.me/paqcompress
I'm having a bit of a problem compiling...
I'm using an up-to-date Manjaro Linux machine with gcc 10.2.0 and following the instructions of the README:
Code:
cmake . -DUNIX=ON -DMT=ON -DNATIVECPU=ON
make
But I'm getting errors and the compilation is cancelled.
Direct invocation of gcc using `g++ paq8pxd.cpp -DUNIX -DMT -msse2 -O3 -s -static -lpthread -lz -o paq8pxd` also doesn't work.
LOG.txt
enwik8 and enwik9 scores for paq8pxd_v100 compared to paq8pxd_v95:
15'642'246 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 10'130,00s
123'151'008 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v95, change: 0,00%, time 102'009,55s
15'582'810 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v100, change: -0,38%, time 22'066,98s
122'586'151 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v100, change: -0,46%, time - 219'956,56s
Both enwik8 and enwik9 scores are the best for the paq8pxd series!
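For reference, the "change" column can be reproduced as a relative size difference against the previous version. The sketch below assumes that definition (new size minus old size, divided by old size); `percent_change` is a hypothetical helper name.

```python
def percent_change(old_size, new_size):
    """Relative size change in percent; negative means the new version is smaller."""
    return 100.0 * (new_size - old_size) / old_size


# Sizes from the list above (paq8pxd_v95 vs paq8pxd_v100, -x15 -w -e1,english.dic).
enwik8_v95, enwik8_v100 = 15_642_246, 15_582_810
enwik9_v95, enwik9_v100 = 123_151_008, 122_586_151

enwik8_change = percent_change(enwik8_v95, enwik8_v100)
enwik9_change = percent_change(enwik9_v95, enwik9_v100)
print(f"enwik8: {enwik8_change:.2f}%")
print(f"enwik9: {enwik9_change:.2f}%")

# Projecting enwik9 from the enwik8 ratio alone (an assumption about how one
# could estimate it before the full run finishes, not a documented method).
projected_enwik9 = round(enwik9_v95 * enwik8_v100 / enwik8_v95)
print("enwik9 projected from enwik8 ratio:", projected_enwik9)
```

Rounded to two decimals this gives the -0,38% and -0,46% shown above. The projection lands within a couple of bytes of the earlier 122'683'070 estimate, while the measured enwik9 result came out roughly 97KB smaller than that.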
kaitz (2nd February 2021)
From the logs it seems to be an issue with some intrinsic headers. What CPU are you using? Have you tried -DNATIVECPU=OFF? It doesn't seem to be related to paq8pxd itself, but rather to the files it pulls in on your machine.
True, but the CMake method should work, and it removes the need to use .bat files.
PAQCompress: http://moisescardona.me/paqcompress
Update: pthreads wasn't being linked on Linux when -DMT=ON. I've fixed this in the cmakelists file: https://github.com/kaitz/paq8pxd/pull/16
PAQCompress: http://moisescardona.me/paqcompress