Yes, sorry for the mistake.
On the other hand, I'm still testing RNN for cmv on the first three files.
Results are interesting, but it will take a few more days to finish.
Can someone compile the cmix v17 build for Windows? https://github.com/byronknoll/cmix/releases/tag/v17
Build target x64/SSE4. Not tested. IntelC19 and gcc82/mingw.
Byron's Windows build is available on the official cmix page, and it works fine: http://www.byronknoll.com/cmix.html
Byron - were your test scores on the official cmix page made with the new dictionary file included in the cmix_v17 package?
Here are the scores of my testset for cmix v17 vs. the latest cmix v16f. In total there is a loss of 8450 bytes, mainly on the biggest, non-textual files.
Pure textual files got a very nice gain of 0.35%, which is impressive.
P.S. The scores aren't optimal yet - I've based them on the v16e/f optimal options.
I'm now testing other options (there are four: 1) just -c, 2) -c english.dic, 3) -s english.dic then -c, and 4) -s english.dic then -c english.dic).
I'll also try to test v17 with the older dictionary (from v16).
Last edited by Darek; 8th April 2019 at 04:21.
Final scores of my testset for cmix v17 - it looks like this version has quite different characteristics than v16f: different options gave the best scores.
@mpais - request - could you add LZW transform from paq8px v178 for TGA and TIFF files to cmix? Is it possible?
Configurable adam learning rate, 2019/04/14:
- Why didn't you write directly learning_rate instead of learning_rate * 0.067?
- Will you use learning_rate statically (a fixed value) or dynamically (a value that changes during compression)?
- How about configuring also sqrt(5e-5 * t + 1)?
- In my tests sometimes beta1 = 0.7/0.8 is better than 0.9.
- https://bellard.org/nncp/nncp.pdf, 2.3 Training details: "We use the Adam optimizer [5] with beta1 = 0, beta2 = 0.9999 and eps = 10^-5. No gradient clipping is done.".
cmix actually uses two optimizers: gradient descent for the final softmax layer, and adam for everything else. My LSTM implementation takes a single learning rate parameter as input, which was used for the gradient descent (the adam parameters were just hardcoded in place to avoid too many configurable parameters). The latest commit is mostly refactoring: it uses the same learning rate parameter for both gradient descent and adam (with a multiplier in adam to make it close to the original hardcoded value).
"learning_rate_" is a fixed value (but "alpha" has the sqrt(5e-5 * t + 1) decay).
I guess that could also be made into a configurable parameter - but I also want to avoid adding too many parameters to reduce complexity (i.e. only expose parameters that would make a significant difference when tuning for some new data set).
Yeah, I will continue experimenting with tuning these parameters. In NNCP the note is also added: "In our implementation, eps is added to the running average before taking the square root"
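To make the parameters discussed above concrete, here is a minimal sketch of one Adam update with a decaying step size. It is not cmix's or NNCP's actual code: all names (AdamState, DecayedAlpha, AdamStep, base_rate) are hypothetical, the decay is assumed to divide the base rate by sqrt(5e-5 * t + 1), the bias correction is the textbook Adam form and may not match either implementation, and eps is added before the square root as in the NNCP note.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch, not cmix's real implementation.
struct AdamState {
  std::vector<float> m;  // running average of gradients
  std::vector<float> v;  // running average of squared gradients
  long t = 0;            // number of updates performed so far
};

// Step size after t updates, ASSUMING the decay divides the base rate:
// alpha = base_rate / sqrt(5e-5 * t + 1).
inline float DecayedAlpha(float base_rate, long t) {
  return base_rate / std::sqrt(5e-5f * static_cast<float>(t) + 1.0f);
}

// One Adam update of weights w given gradients g.
inline void AdamStep(std::vector<float>& w, const std::vector<float>& g,
                     AdamState& s, float base_rate, float beta1, float beta2,
                     float eps) {
  assert(w.size() == g.size());
  if (s.m.empty()) {
    s.m.assign(w.size(), 0.0f);
    s.v.assign(w.size(), 0.0f);
  }
  assert(s.m.size() == w.size() && s.v.size() == w.size());
  ++s.t;
  const float alpha = DecayedAlpha(base_rate, s.t);
  // Textbook bias-correction factors (an assumption, see lead-in).
  const float c1 = 1.0f - std::pow(beta1, static_cast<float>(s.t));
  const float c2 = 1.0f - std::pow(beta2, static_cast<float>(s.t));
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.m[i] = beta1 * s.m[i] + (1.0f - beta1) * g[i];
    s.v[i] = beta2 * s.v[i] + (1.0f - beta2) * g[i] * g[i];
    const float m_hat = s.m[i] / c1;
    const float v_hat = s.v[i] / c2;
    // eps is added to the running average before taking the square root.
    w[i] -= alpha * m_hat / std::sqrt(v_hat + eps);
  }
}
```

With this shape, trying beta1 = 0.7/0.8 or the NNCP values (beta1 = 0, beta2 = 0.9999, eps = 1e-5) is just a change of arguments, and base_rate would correspond to something like learning_rate * 0.067.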
Stupid question, but I need to ask.
I need to compress files with the latest cmix release (v17), but the drag-and-drop method doesn't work. What's the correct command? I'm using Windows 7.
Thanks.
CompressMaster
Code:
\cmix\src>cmix.exe
cmix version 12
With preprocessing:
    compress:           cmix -c [dictionary] [input] [output]
    only preprocessing: cmix -s [dictionary] [input] [output]
    decompress:         cmix -d [dictionary] [input] [output]
Without preprocessing:
    compress:   cmix -c [input] [output]
    decompress: cmix -d [input] [output]

So either
cmix -c english.dic input output
or
cmix -c input output
Keep in mind that it's good to have >32 GB of RAM for it to work.
I encountered a critical problem when I tried to compress my text file with cmix v17 (March 2019). Screenshot is attached. It seems to be an issue with memory (I have 4 GB). Where's the problem? Thanks.
CompressMaster
You can also compile it with -DDEFAULT_OPTION=4 or something.
Also this:
AddByteModel(new PPMD::PPMD(6, 1200, manager_.bit_context_, vocab_));
AddByteModel(new PPMD::PPMD(16, 1200, manager_.bit_context_, vocab_));
1200 here is allocated memory in MB, can be reduced too.
Maybe a few other places.
Of course, the compression would be worse than normal cmix.
But maybe still better than paq8?
cmix v18 2019/08/01
http://www.byronknoll.com/cmix.html
https://github.com/byronknoll/cmix
https://github.com/byronknoll/cmix/releases
Changes from version 17 to version 18:
- LSTM improvements
@Byron: if it is not a problem, could you announce in this thread when you release a new version of cmix? TIA
Sure. Here are some results:
Compressed size of enwik8: 14838332 bytes
Compressed size of enwik9: 115714367 bytes
Size of source code as a zip file: 208,961 bytes
zip file contains:
- all source code
- makefile
- dictionary
enwik9 compression time: 602867.49 seconds
enwik9 decompression time: 601569.89 seconds
Approximate memory used: 25738196 KiB
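For readers who want these figures as ratios, here is a small sketch of the arithmetic. It assumes the standard input sizes of 10^8 bytes for enwik8 and 10^9 bytes for enwik9 (not stated in the post), and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for turning the reported figures into ratios.

// Compressed bits spent per input byte.
inline double BitsPerByte(std::uint64_t compressed_bytes,
                          std::uint64_t original_bytes) {
  assert(original_bytes > 0);
  return 8.0 * static_cast<double>(compressed_bytes) /
         static_cast<double>(original_bytes);
}

// Average number of input bytes processed per second.
inline double BytesPerSecond(std::uint64_t original_bytes, double seconds) {
  assert(seconds > 0.0);
  return static_cast<double>(original_bytes) / seconds;
}

// Convert a duration in seconds to days (86400 seconds per day).
inline double SecondsToDays(double seconds) {
  return seconds / 86400.0;
}
```

Under those size assumptions, enwik9 comes out at roughly 0.926 bits per byte and enwik8 at roughly 1.187, while the enwik9 compression time is just under 7 days, or about 1.66 KB of input per second.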
Here are the Silesia results for "precomp v0.4.7 -cn | cmix v18"
dicke: 1813095
mozil: 6717412
mr: 1829883
nci: 792994
ooff: 1226244
osdb: 1962336
reym: 712062
samba: 1614935
sao: 3727061
webst: 4297002
x-ray: 3508509
xml: 236101
total: 28437634
Scores for my testset - 13 KB of gain - nice, especially given that there are "only" LSTM improvements!
Good scores for bigger and exe files.
cmix v18 scores for 4 corpora - mostly based on Byron's scores (thanks for posting them on the cmix page!).
In general, for the bigger corpora there is a significant gain. For smaller files (Calgary, Canterbury) it is not so big...
One advantage over v17 is a small speedup (on my laptop): compression times of cmix v18 are about 96% of those of cmix v17, so there is a good compression gain and a small speedup in the same version!
The second table shows the best scores for the 4 corpora as of August 2019.
How do I compile cmix in Dev-C++? Besides computer vision, I am interested in data compression too. This is my little computer vision project on Android.
https://youtu.be/ofwEuCclswM
Just download http://www.byronknoll.com/cmix-v18.zip and compile normally?
It compiles with
find . -iname '*.cpp' >list
g++ @list
so just add all cpp files to a project or something?
devcpp is 4 years old.
I haven't tried, but I suspect it won't compile cmix out-of-the-box.
You should switch to a more recent development environment.
I tried, and cmix v18 compiles with gcc 5.10, but only with "g++ -std=gnu++0x -static @list"; there are errors without "-std".
makefile is not a shell script, different syntax.
Unpack attached scripts to cmix\src\, update gcc path in g.bat (C:\MinGW510\bin to your path with g++.exe), then run g.bat.
Cmix commit 2019/12/05, changing from layer_norm to rms_norm: yes, rms_norm looks better than layer_norm.
How much does cmix (or lstm-compress) improve?
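For anyone unfamiliar with the two normalizations, here is a minimal sketch of the general definitions. It is not taken from the cmix commit: the function names and the eps value are assumptions, and the learned gain/bias terms that real implementations usually apply afterwards are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generic definitions only; names and eps are illustrative assumptions.

// Layer normalization: subtract the mean, then divide by the standard deviation.
inline std::vector<float> LayerNorm(const std::vector<float>& x,
                                    float eps = 1e-5f) {
  assert(!x.empty());
  const float n = static_cast<float>(x.size());
  float mean = 0.0f;
  for (float v : x) mean += v;
  mean /= n;
  float var = 0.0f;
  for (float v : x) var += (v - mean) * (v - mean);
  var /= n;
  const float inv_std = 1.0f / std::sqrt(var + eps);
  std::vector<float> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) out[i] = (x[i] - mean) * inv_std;
  return out;
}

// RMS normalization: divide by the root mean square, with no mean subtraction.
inline std::vector<float> RmsNorm(const std::vector<float>& x,
                                  float eps = 1e-5f) {
  assert(!x.empty());
  const float n = static_cast<float>(x.size());
  float mean_sq = 0.0f;
  for (float v : x) mean_sq += v * v;
  mean_sq /= n;
  const float inv_rms = 1.0f / std::sqrt(mean_sq + eps);
  std::vector<float> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] * inv_rms;
  return out;
}
```

The RMS variant skips the mean computation and the subtraction, so it does a bit less work per vector.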