Yes, no changes for enwik8.
We reached the max with o40 m3360.
Instead, we should see improved compression for enwik9.
At least I hope!
Thank you
Luca
enwik scores:
15'654'147 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v89_40_3360, change: -0,01%
123'013'220 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v89_40_3360, change: -0,23%, memory used 32'130MB (by paq8pxd)
15'654'151 - enwik8 -x15 -w -e1,english.dic by paq8pxd_v89_60_4095, change: 0,00%
122'945'119 - enwik9 -x15 -w -e1,english.dic by paq8pxd_v89_60_4095, change: -0,06%, memory used 34'335MB (by paq8pxd) - and finally there is a gain of about 70kB!
Last edited by Darek; 19th July 2020 at 18:02.
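For anyone who wants to re-check the "change" column, here is a minimal sketch of the arithmetic, assuming the percentage is simply (new - old) / old. The names `Delta` and `sizeChange` are made up for illustration and are not part of paq8pxd.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: difference between two compressed sizes,
// in bytes and as a percentage of the old size.
struct Delta {
    std::int64_t bytes;   // negative means the new result is smaller
    double percent;       // same sign convention as bytes
};

Delta sizeChange(std::int64_t oldSize, std::int64_t newSize) {
    assert(oldSize > 0);  // a percentage of zero bytes is undefined
    const std::int64_t diff = newSize - oldSize;
    return Delta{diff, 100.0 * static_cast<double>(diff) / static_cast<double>(oldSize)};
}
```

Calling `sizeChange(123013220, 122945119)` with the two enwik9 sizes above gives a difference of -68'101 bytes, which matches the roughly 70kB gain and the -0,06% in the list.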
LucaBiondi (18th July 2020)
Thank you very much Darek!
122'945'119, well, that is not bad at all.
My goal was to push ppm_mod to the limit.
I have tried a few parameters. As soon as I have a moment I will write up what I found. If kaitz or someone else wants to apply these small changes to paq8pxd, I will be happy.
Luca
A gain is always a gain.
These changes plus maybe LSTM could give us 121'xxx'xxx bytes in the end.
Hi Darek!
I think the same!!
Which do you think is the best version overall, 40 3360 or 60 4095?
Luca
For my testset the best version is 40 3360; however, as was visible on enwik9, 60 4095 is better for big files.
I'll need to test the 4 corpuses on 60 4095, and then maybe this choice will be easier.
Good idea!
Luca
Scores of 4 Corpuses for paq8pxd_v89_ppm_60_4095.
It looks like this version produces worse scores than 40_3360 on the smaller test sets (Calgary, Canterbury, Maximum Compression and my testset), but better scores on the bigger ones like Silesia and enwik9.
So it's hard to be objective and say definitely, but if we add all 5 corpuses together then 60_4095 wins, and it also wins for enwik9, so let it be: in my opinion 60_4095 should be the best version.
Hi Darek!
Good job and thank you!
If Kaitz or Shelwien want to adopt these parameters I will be happy!
Luca
It's not that easy, at least for me.
I tried to rewrite lstm.inc with the current version of LSTM present in cmix (with the differences needed to use it in paq8pxd), however the decompressed file is different from the original.
Then I started from the original lstm.inc, but I had the same problem, trying it with paq8pxd 73 (first version in which the management of lstm.inc is present) and 89 (latest version), g++ 6.3 and 7.1 with various options.
After debugging for some time, it is still not clear to me if it is a compilation problem (but I have tried 2 g++ and with different options) or in the LSTM source (it doesn't seem, but it is difficult to debug).
Has anyone else tried to enable the LSTM part?
It's not my main work at the moment; if I can't solve it without spending much more time, I'll have to give up.
@kaitz: I've found a problem with SZDD preprocessing (apparently) - GitHub issue.
This file does not make a sound roundtrip. No crashes, but the decompressed file is different to the original. I didn't actually compress the file (-s0) so I guess it's safe to assume the SZDD implementation is the cause.
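To narrow down where a failed roundtrip goes wrong, it can help to know the offset of the first differing byte. The following is only a minimal sketch of such a check in plain C++, not anything taken from paq8pxd; the function name `firstMismatch` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: compares two files byte by byte.
// Returns -1 if they are identical, otherwise the offset of the first
// difference (if one file is a prefix of the other, that is the length
// of the shorter file).
std::int64_t firstMismatch(const std::string& pathA, const std::string& pathB) {
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    assert(a.is_open() && b.is_open());  // precondition: both files are readable

    const int eof = std::char_traits<char>::eof();
    std::int64_t offset = 0;
    for (;;) {
        const int ca = a.get();
        const int cb = b.get();
        if (ca != cb) return offset;  // also covers EOF reached in only one file
        if (ca == eof) return -1;     // both ended together: identical
        ++offset;
    }
}
```

Run it on the original file and the decompressed output; a result other than -1 gives the position to start looking at in a hex editor.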
By the way, great work on the preprocessor! I wonder if it could be separated into a standalone library so it can be included in other software, like precomp. Especially since paq8pxd is GPL, so its code can't really be shared with most other projects that use less restrictive licenses.
Last edited by Gonzalo; 11th October 2020 at 17:49. Reason: Replaced attachment. See GitHub issue for an explanation.
kaitz (24th October 2020)
Confirmed. I made a GitHub issue for it. I will look into it in January.
---
I also uploaded paq8pxd_v90, if someone wants to test it. I can't upload large files now, so source only.
KZo
Gotty (31st October 2020),Mauro Vezzosi (24th October 2020)
Hey @kaitz, I went ahead and created a CMakeLists.txt file to allow compilation using CMake/Make. See my PR here: https://github.com/kaitz/paq8pxd/pull/12
PAQCompress: http://moisescardona.me/paqcompress
Darek (25th October 2020)
First, my testset scores for paq8pxd_v90. Nice improvements. It is still some bytes behind the latest paq8px (about 100KB, without LSTM), but almost all files got some gains.
There are only some losses for the biggest 24bpp images: both of them lose about 400 bytes.
Gotty (31st October 2020)
Paq8pxd90fix1
- improve jpeg compression by adding 5 apm and 1 mixer context
paq8pxdv90 -s8 test.jpg 2187172
paq8pxdv90fix1 -s8 test.jpg 2185794
paq8pxdv90 -s8 a10.jpg 623059
paq8pxdv90fix1 -s8 a10.jpg 622691
The source code, a binary and a batch script to compile are inside the package.
Please submit your changes to the paq8pxd repo: https://github.com/kaitz/paq8pxd so that Kaitz can review it.
You can learn git yourself, the internet is full of introductions, guides, forums. Please make some efforts. Why don't you?
https://encode.su/threads/342-paq8px...ll=1#post67129
https://encode.su/threads/342-paq8px...ll=1#post67131
https://encode.su/threads/342-paq8px...ll=1#post67147
You did not fork.
You will need to learn how to drive before you would like to sit in an actual car. Learn git.
Create git repositories in your account, and do forks, do pull requests, do merges between them until you understand it wholly.
Here are the scores of paq8pxd_v90 for the 4 corpuses. Nice gains for all corpuses, especially about 240KB in Silesia due to a big improvement on the Mozilla file.
Unfortunately 3 files from MaximumCompression crash during compression with the -x15 option: A10.JPG, ohs.doc and maximumCompression.tar.
Gotty (2nd November 2020)
Kaitz,
Line 3599:
int p1 = state?Maps8b[i]->p(state,m.x.y):0;
->
int p1 = state?Maps8b[i]->p(state,m.x.y):2048; ?
What about the enwik9 result using -x15 -w -e1,English.dic?
It's OK, p1 is only used when state!=0. It should be inside the if statement.
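Some background on why 2048 was suggested, as a rough sketch only. It assumes the usual paq8-style convention that probabilities are 12-bit values (0..4095) and that predictions enter the mixer in the logistic "stretch" domain, where 2048 (50%) maps to 0. This is not the paq8pxd code: `stretchApprox` and `guardedInput` are hypothetical names, and the real sources use an integer lookup table rather than floating point.

```cpp
#include <cassert>
#include <cmath>

// Assumption: 12-bit probability scale, so 2048 means "no preference".
constexpr int kNeutralP = 2048;

// Floating-point stand-in for stretch(p) = ln(p / (1 - p)), scaled by 256.
double stretchApprox(int p) {
    assert(p > 0 && p < 4096);  // 0 and 4096 would give infinite values
    return 256.0 * std::log(p / (4096.0 - p));
}

// Hypothetical guard: only touch the probability when the state is non-zero.
// pFromMap stands in for whatever a lookup like Maps8b[i]->p(...) returns.
int guardedInput(int state, int pFromMap) {
    if (state == 0) return 0;  // nothing known yet: contribute a neutral 0
    return static_cast<int>(std::lround(stretchApprox(pFromMap)));
}
```

With the guard in place the fallback value never reaches the stretch step, so whether it is 0 or 2048 makes no difference to the output, which is the point made in the reply above.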
The JPEG error is my mistake. It's an old mistake I keep making at the larger levels.
Enwik9 may not be better than with _60_4095; there are no changes from that version. Probably no improvement.
KZo
Doing some compression testing using paq8pxd V90 (SSE compile; can't use AVX2 compile).
A pdf file being compressed generates a lot of "Transform fails at 0, skipping . . . " messages when compressing this file!
paq8pxd v90 does compress the pdf file in the end, but I suspect compression would be better if paq8pxd v90 didn't "fail" on reading/interpreting (parts of) the pdf file.
File attached (7z format compressed).
Can anyone check this and maybe provide a solution to enable full compression of this file, please?