Next info
2017/08/06
- Made 0.2.0 beta 1 standard and extreme versions.
2017/08/06
- Added the "Previously seen bit" model: order 0 (history 1 and 2), 1 and 2.
It is always enabled, but only for methods >= 1.
2017/07/29
- Removed bit 28 ((mixer2(mixerN_fast_lr(), mixerN_slow_lr()))) and bit 29 (mixerN SSE) from the "*" method options (they aren't models).
2017/07/26
Benchmarks for version 0.1.1, Squeeze Chart (PDF, JPG, MP3, PNG, Installer.. (Compressing Already Compressed Files)), results not verified.
-m1,3,+ -m2,0,+ -m2,0,*
Documents
5.445.616 5.452.943 5.446.604 busch2.epub
87.474.099 84.864.362 diato.sf2
38.532.255 38.534.942 freecol.jar
12.582.707 12.571.833 maxpc.pdf
85.128.562 SoniMusicae-Diato-sf2.zip
791.766 791.372 791.405 squeeze.xlsx
Image Formats (Camera Raw)
22.065.306 22.043.852 canon.cr2
11.325.973 11.109.503 fuji.raf
33.826.108 leica.dng
32.532.433 32.532.436 nikon.nef
15.862.814 15.859.052 15.856.336 oly.orf
16.110.809 pana.rw2
11.947.899 11.978.628 sigma.x3f
15.032.909 14.945.507 14.669.162 sony.arw
20.438.149 20.288.840 sony2.arw
Image Formats (web)
1.782.552 1.781.843 1.773.735 filou.gif
4.638.655 4.642.178 4.639.766 flumy.png
6.828.506 6.834.776 6.828.513 mill.jpg
Installers
101.253.297 102.250.965 amd.run
189.818.215 184.804.805 cab.tar
54.115.084 54.089.566 inno.exe
18.551.343 18.528.542 18.526.117 setup.msi
53.701.756 54.112.501 wise.exe
Interactive files
340.557 340.066 flyer.msg
5.900.341 5.898.822 swf.tar
Scientific Data
26.924 14.915 7.167 block.hex
167.944.593 147.688.356 msg_lu.trace
116.023.192 110.989.967 num_brain.trace
36.433.304 34.696.885 obs_temp.trace
Songs (Tracker Modules)
6.645.290 6.501.083 6.456.087 it.it
15.791.191 15.354.477 mpt.mptm
7.428.792 7.286.070 7.180.942 xm.xm
Songs (web)
19.246.345 19.245.422 aac.aac
16.508.192 16.520.159 16.507.079 diatonis.wma
127.728.723 127.419.074 mp3corpus.tar
36.334.963 36.332.287 ogg.ogg
Videos (web)
15.142.319 15.139.010 15.133.624 a55.flv
32.153.051 32.151.082 h264.mkv
162.890.183 162.883.854 star.mov
96.324.466 96.480.593 van_helsing.ts
-m1,3,+ -m2,3 -mx Squeeze Chart
68.364.894 67.828.879 65.646.788 squeezechart_app.tar (tarball)
2017/07/23
- -m2,,- (-m2 with fewer options): changed Order 0..1 bit history (bits 24-25) from 0 to 1.
2017/05/23
Benchmarks for version in development (2017/05/23 >0.2.0a5, VOM model is disabled, mod_ppmd enabled), compared to the last official (0.1.1) and previous development (2016/06/08 0.2.0 ? and 2016/11/28 >0.2.0a3 (VOM model is disabled)) versions.
0.1.1 -mx 0.2.0 ? -mx >0.2.0 a3 -mx >0.2.0 a5 -mx Maximum Compression
2016/01/10 2016/06/08 2016/11/28 2017/05/23
820.501 819.931 819.582 819.172 A10.jpg
1.017.441 997.721 976.062 974.342 AcroRd32.exe
400.343 381.471 375.731 371.535 english.dic
3.574.241 3.566.950 3.555.244 3.554.780 FlashMX.pdf
280.132 267.665 258.477 256.850 FP.LOG
1.387.079 1.365.962 1.338.128 1.335.340 MSO97.DLL
687.731 683.352 678.767 677.244 ohs.doc
653.728 634.380 633.578 633.062 rafale.bmp
399.793 391.522 386.441 385.870 vcfiu.hlp
372.831 362.592 356.292 352.845 world95.txt
9.593.820 9.471.546 9.378.302 9.361.040 Total
9.632.713 9.515.604 9.386.444 9.366.438 MaxCompr.tar (very close to single file compression total :-))
100.031 100.031 sharnd_challenge.dat
34 34 Test_000
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx >0.2.0 a5 -mx Large Text Compression Benchmark
2016/01/10 (>&b12) 2016/06/08 2016/11/28 2017/05/23
24 24 24 24 ENWIK0
31 32 32 32 ENWIK1
91 89 88 89 ENWIK2
299 287 284 286 ENWIK3
2.997 2.953 2.922 2.925 ENWIK4
25.568 25.248 25.076 24.998 ENWIK5
214.137 211.478 209.352 208.326 ENWIK6
1.968.717 1.941.513 1.915.141 1.900.749 ENWIK7
18.153.319 17.898.994 17.650.885 17.325.679 ENWIK8
16.677.314 ENWIK8.drt (-m2,3,0x53e90df7)
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx >0.2.0 a5 -mx Other
2016/01/10 (>&b12) 2016/06/08 2016/11/28 2017/05/23
194.417 192.013 191.081 190.221 book1
613.782 605.705 599.057 597.065 Calgary corpus.tar
617.996 603.261 601.940 Calgary corpus.tar.paq8pxd16
331.931 327.901 325.392 324.517 Canterbury corpus.tar
333.145 326.907 326.382 Canterbury corpus.tar.paq8pxd16
>0.2.0 a5 -m2,1,* Squeeze Chart - http://www.squeezechart.com
2017/05/23
6.684.067 MKVtoonix-GUI 10.0 64 Bit
0.1.1 -mx >0.2.0 a5 -mx Wratislavia XML Corpus - http://pskibinski.pl/research/Wratislavia/
2016/01/10 2017/05/23
1.103.033 1.074.470 shakespeare.xml
56.836 54.461 uwm.xml
0.1.1 -m2,3,0x03ededff >0.2.0 a3 -mx >0.2.0 a5 -mx 10 GB Compression Benchmark
2016/01/10 (>&b12) 2016/11/28 2017/05/23
33.024.880 33.072.180 32.621.364 100mb.tar (100mb subset) (tarball)
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx >0.2.0 a5 -mx Compression Competition -- $15,000 USD
2016/01/10 (>&b12) 2016/06/08 2016/11/01 2017/05/23
14.897.825 14.782.006 14.737.701 14.710.428 SRR062634.filt.fastq
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx >0.2.0 a5 -mx Specific case - High redundant data in a pattern
2016/01/10 (>&b12) 2016/06/08 2016/11/28 2017/05/23
1.242 1.282 1.200 1.235 NUM.txt
0.1.1 -m2,3,0x03ededff PPMonstr J >0.2.0 a3 -mx >0.2.0 a5 -mx Generic Compression Benchmark
2016/01/10 (>&b12) 2006/02/16 2016/11/28 2017/05/23
2.923.969 3.003.153 2.865.327 2.858.429 uiq2-seed'0'-n1000000
0,97363305 1,00000000 0,95410623 0,95180932 Size (Ratio)
Prime Number Benchmark
msb lsb dbl hex text twice interlv delta bytemap bitmap total tarball tarball/total (total = 4.803.388; tarball = 4.812.800, 7z.exe a -ttar)
39.547 39.314 40.403 41.851 39.019 39.569 40.831 34.251 25.789 32.110 372.684 292.304 0,78432130169 cmv -mx 0.1.1
39.379 39.181 40.011 41.629 38.829 39.400 40.548 33.999 23.851 31.659 368.486 290.754 0,78905033027 cmv -max 0.1.1
38.233 37.987 38.897 41.233 38.301 38.261 39.570 33.944 24.149 31.909 362.484 284.835 0,78578640712 cmv -mx 0.2.0 (2016/06/08)
37.963 37.679 38.711 40.893 38.131 37.995 39.136 33.925 22.864 31.895 359.192 281.555 0,78385654469 cmv -mx >0.2.0 a3 (2016/11/28)
37.834 37.510 38.600 40.737 38.016 37.863 39.133 33.856 22.880 31.850 358.279 280.948 0,78415983075 cmv -mx >0.2.0 a5 -mx (2017/05/23)
37.835 37.973 38.597 41.619 38.829 37.894 36.255 33.726 23.851 29.527 356.106 Best overall (2016/01/14)
nz nz nz cmix cmv nz nz cmix cmv cmix
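The tarball/total column is just the tarball size divided by the sum of the ten per-representation sizes; a quick check of the first row ("cmv -mx 0.1.1") using the numbers from the table:

```python
# Sanity check of the tarball/total ratio for the "cmv -mx 0.1.1" row:
# total is the sum of the ten per-representation sizes (msb..bitmap),
# tarball is the size of compressing them as a single tar.
sizes = [39547, 39314, 40403, 41851, 39019,
         39569, 40831, 34251, 25789, 32110]
total = sum(sizes)        # 372.684, as in the table
tarball = 292304
ratio = tarball / total   # ~0,78432130169
print(total, ratio)
```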
0.1.1 -mx 0.1.1 Optimal 0.2.0a3 ICM1 Opt. 0.2.0a3 Extreme Opt. >0.2.0 a3 -mx >0.2.0 a5 -mx >0.2.0 a5Opt. Darek's testbed
2016/01/10 2016/01/10 2016/09/27 2016/09/27 2016/11/01 2017/05/23 2017/05/23
1.404.493 1.399.464 1.381.660 1.380.299 1.386.222 1.384.535 1.380.937 0.WAV
323.277 322.990 303.175 301.866 303.864 303.490 302.942 1.BMP
835.277 834.675 744.125 734.102 745.829 744.815 743.806 A.TIF
780.311 779.596 692.949 683.189 693.092 692.099 690.734 B.TGA
327.568 325.838 318.478 318.919 321.261 321.066 318.102 C.TIF
310.952 310.480 303.235 303.056 303.979 303.659 302.614 D.TGA
497.089 496.837 496.142 494.695 496.487 496.368 496.055 E.TIF
110.914 110.799 110.755 110.571 110.809 110.798 110.744 F.JPG
1.367.023 1.366.851 1.358.405 1.356.314 1.357.977 1.356.525 1.356.251 G.EXE
482.720 482.701 462.836 463.150 461.994 460.785 460.361 H.EXE
227.898 227.791 217.797 217.981 217.765 217.459 217.096 I.EXE
43.364 43.325 43.172 43.179 43.173 43.178 43.151 J.EXE
2.618.200 2.616.775 2.540.011 2.536.718 2.534.630 2.531.528 2.532.372 K.WAD
2.830.322 2.829.223 2.751.490 2.745.121 2.751.856 2.748.195 2.746.108 L.PAK
55.438 55.027 52.420 51.990 52.617 52.376 51.634 M.DBF
86.689 86.688 83.967 83.837 83.811 83.553 83.457 N.ADX
3.777 3.775 3.672 3.668 3.670 3.678 3.669 O.APR
947 938 878 879 912 909 880 P.FM3
187.547 187.547 170.171 168.828 169.843 169.048 168.401 Q.WK3
29.327 29.245 28.781 28.768 28.782 28.725 28.632 R.DOC
26.020 25.970 25.447 25.435 25.418 25.359 25.261 S.DOC
18.752 18.727 18.367 18.367 18.285 18.274 18.232 T.DOC
8.681 8.646 8.536 8.532 8.530 8.535 8.506 U.DOC
18.570 18.519 18.213 18.206 18.198 18.165 18.072 V.DOC
13.228 13.185 13.007 13.002 12.995 12.988 12.930 W.DOC
11.049 10.997 10.854 10.857 10.863 10.848 10.792 X.DOC
323 320 314 314 318 318 315 Y.CFG
176 171 166 166 169 171 166 Z.MSG
12.619.932 12.607.100 12.159.023 12.122.009 12.163.349 12.147.447 12.132.220 Total
12.166.965 12.152.547 Testbed.tar (quite close to single file compression total :-))
0.1.1 -m2,3,0x03ededff >0.2.0 a5 -mx Other 2
2016/01/10 (>&b12) 2017/05/23
175.282 160.565 AIMP_free.tga
5.265.328 5.215.493 FFADMIN.EXE
46.949 43.339 _FOSSIL_
0.1.1 -mx >0.2.0 a5 -m2,0,+|0x18000000 >0.2.0 a5 -m2,0,* Squeeze Chart (Txt Bible (Compressing Text In Different Languages))
2016/01/10 2017/05/23 2017/05/23
693.479 679.334 671.851 afri.txt
746.900 729.857 722.773 alb.txt
624.237 612.327 608.960 ara.txt
705.168 690.952 687.040 chi.txt
922.105 903.396 891.634 cro.txt
749.296 733.615 728.593 cze.txt
695.223 681.642 675.363 dan.txt
729.682 715.570 707.159 dut.txt
659.272 646.581 638.960 eng.txt
649.197 637.874 629.684 esp.txt
723.353 709.038 703.339 fin.txt
692.336 677.031 669.252 fre.txt
717.826 702.885 696.772 ger.txt
221.765 218.525 212.385 gre.txt
625.971 617.692 605.519 heb.txt
782.307 764.490 760.277 hun.txt
729.490 715.866 707.533 ita.txt
637.865 623.902 621.463 kor.txt
816.845 800.804 791.440 lat.txt
676.332 660.210 653.353 lit.txt
678.285 666.325 656.289 mao.txt
694.682 681.233 674.540 nor.txt
712.650 699.289 693.137 por.txt
712.598 697.016 690.182 rom.txt
744.556 725.742 721.408 rus.txt
708.648 695.153 687.979 spa.txt
758.391 740.013 731.808 swe.txt
687.045 671.278 663.161 tag.txt
778.152 750.736 748.526 thai.txt
662.605 645.354 640.555 turk.txt
712.785 696.119 692.605 vie.txt
715.801 699.326 692.804 xho.txt
22.364.847 21.889.175 21.676.344 Total
2017/??/??
- Added Shelwien's mod_ppmd to the already existing PPM low-orders model (bit 27).
Method: 0: Order 4. 1: Order 6. 2: Order 16.
Memory: -m,0 -m,1 -m,2 -m,3.
-m0: Mb 8 16 32 64.
-m1: Mb 32 64 128 256.
-m2: Mb 128 256 512 1024.
(Memory = MB(8 << (method * 2 + memory)))
- Changed ICM (indirect context models): from 5+5 counters + 3-bit history and "full" mixer to 6+6 counters + 2-bit history and a quantized mixer.
- Extreme version: added SSE in the mixer of 2 models (main model (mixallmixmixp3[]) and Gap model 2) when bits 28 and 29 are switched on.
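The memory sizes in the table above all follow the formula on its last line; a tiny sketch that just restates it (not CMV's actual allocation code):

```python
# Reproduces the -m memory table: Memory = 8 << (method * 2 + memory) MB.
def cmv_memory_mb(method: int, memory: int) -> int:
    return 8 << (method * 2 + memory)

for method in range(3):
    row = " ".join(str(cmv_memory_mb(method, m)) for m in range(4))
    print(f"-m{method}: Mb {row}")
```

This prints the same three rows as the changelog (-m0: 8..64 Mb up to -m2: 128..1024 Mb).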
2017/01/17
- Made 0.2.0 alpha 5 standard and extreme versions.
2017/01/??
- Extreme version: Final N input mixer (bit 20-21, option 2): chained the predictions of the second layer of the mixers tree 15->3->1.
2017/01/05
- Extreme version: Final N input mixer (bit 20-21, option 2): 20 bits precision was wrong (see 2016/12/29), decreased to 19 bits.
2016/12/31
- Extreme version: Final N input mixer (bit 20-21, option 2): changed the mixers tree from 6->3->1 to 15->3->1 (experimental change; it may be modified or removed in the future).
2016/12/29
- Extreme version: Final N input mixer (bit 20-21, option 2): more precision in mixer2 (from 16 to 20 bits precision) (mixer2(mixerN_fast_lr(), mixerN_slow_lr())) (very small gain).
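A hedged floating-point sketch of the two-input mix that mixer2 performs on the fast and slow learning-rate mixers; the real CMV code works in fixed point (16, then 20, then 19 bits of precision), and the logistic mixing shown here is a standard context-mixing technique assumed for illustration, not lifted from CMV:

```python
import math

# Illustrative 2-input logistic mixer: combine two probabilities in the
# stretched (logit) domain and squash the weighted sum back.
def stretch(p: float) -> float:
    return math.log(p / (1.0 - p))

def squash(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def mixer2(p_fast: float, p_slow: float,
           w_fast: float = 0.5, w_slow: float = 0.5) -> float:
    return squash(w_fast * stretch(p_fast) + w_slow * stretch(p_slow))

print(mixer2(0.7, 0.9))   # between the two inputs, pulled toward 0.9
```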
2016/12/??
- I made some attempts to improve the mixer, but all failed except weight initialization (very small gain).
- Final N input mixer (bit 20-21, option 2): changed (fixed) the initialization of the second-level mixer (now it seems to be a little bit worse).
2016/11/28
Benchmarks for version in development (2016/11/28 >0.2.0a3, VOM model is disabled), compared to the last official (0.1.1) and previous development (2016/06/08 0.2.0 ?) versions.
0.1.1 -mx 0.2.0 ? -mx >0.2.0 a3 -mx Maximum Compression
2016/01/10 2016/06/08 2016/11/28
820.501 819.931 819.582 A10.jpg
1.017.441 997.721 976.062 AcroRd32.exe
400.343 381.471 375.731 english.dic
3.574.241 3.566.950 3.555.244 FlashMX.pdf
280.132 267.665 258.477 FP.LOG
1.387.079 1.365.962 1.338.128 MSO97.DLL
687.731 683.352 678.767 ohs.doc
653.728 634.380 633.578 rafale.bmp
399.793 391.522 386.441 vcfiu.hlp
372.831 362.592 356.292 world95.txt
9.593.820 9.471.546 9.378.302 Total
9.632.713 9.515.604 9.386.444 MaxCompr.tar (very close to single file compression total :-))
100.031 sharnd_challenge.dat
34 Test_000
0.1.1 -mx 0.2.0 ? -mx 0.2.0a2 0.2.0a3 >0.2.0 a3 -mx Silesia Open Source Compression Benchmark
2016/01/10 2016/06/08 2016/07/02-06 2016/09-10 2016/11/28
2.047.734 2.016.906 2.015.523 2.003.326 1.999.489 dickens
10.117.960 9.958.936 9.942.493 9.812.955 9.800.725 mozilla
2.065.773 2.004.632 2.003.968 2.002.364 2.003.809 mr
969.798 932.177 931.890 925.225 920.860 nci
1.698.391 1.662.164 1.658.048 1.625.486 1.622.001 ooffice
2.081.744 2.052.039 2.052.089 2.042.336 2.042.487 osdb
830.294 813.002 812.739 804.606 799.896 reymont
2.799.600 2.764.836 2.762.207 2.740.579 2.725.908 samba
3.807.684 3.776.032 3.775.953 3.764.244 3.764.346 sao
5.292.616 5.184.527 5.177.512 5.126.226 5.080.963 webster
3.577.455 3.556.659 3.555.720 3.554.770 3.556.252 x-ray
281.484 276.571 276.435 275.044 272.247 xml
35.570.533 34.998.481 34.964.577 34.677.161 34.588.983 Total
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx Large Text Compression Benchmark
2016/01/10 (>&b12) 2016/06/08 2016/11/28
24 24 24 ENWIK0
31 32 32 ENWIK1
91 89 88 ENWIK2
299 287 284 ENWIK3
2.997 2.953 2.922 ENWIK4
25.568 25.248 25.076 ENWIK5
214.137 211.478 209.352 ENWIK6
1.968.717 1.941.513 1.915.141 ENWIK7
18.153.319 17.898.994 17.650.885 ENWIK8
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx Other
2016/01/10 (>&b12) 2016/06/08 2016/11/28
194.417 192.013 191.081 book1
613.782 605.705 599.057 Calgary corpus.tar
617.996 603.261 Calgary corpus.tar.paq8pxd16
331.931 327.901 325.392 Canterbury corpus.tar
333.145 326.907 Canterbury corpus.tar.paq8pxd16
0.1.1 -m2,3,0x03ededff >0.2.0 a3 -mx 10 GB Compression Benchmark
2016/01/10 (>&b12) 2016/11/28
33.024.880 33.072.180 100mb.tar (100mb subset) (tarball) (worse than 0.1.1 :-()
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx Compression Competition -- $15,000 USD
2016/01/10 (>&b12) 2016/06/08 2016/11/01
14.897.825 14.782.006 14.737.701 SRR062634.filt.fastq
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0 a3 -mx Specific case - High redundant data in a pattern
2016/01/10 (>&b12) 2016/06/08 2016/11/28
1.242 1.282 1.200 NUM.txt
0.1.1 -m2,3,0x03ededff >0.2.0 a3 -mx Testing compressors with artificial data
2016/01/10 (>&b12) 2016/11/28
1.000.048 1.000.070 a6000
875.600 875.617 a6001
500.888 501.254 a6004
126.131 126.717 a6007
71 95 b6002
71 84 b6004
73 75 b6010
148 93 b6100
23 23 c0000
68 67 c6000
74 118 c6001
101 67 c6255
200 162 d6001
80 166 d6016
71 105 d6128
179 204 i6002
181 212 i6004
265 190 i6010
488 421 i6100
247 175 l6002
214 212 l6003
211 200 l6004
229 192 l6008
387 260 m6002
203 240 m6003
339 296 m6004
386 383 m6008
500.830 500.645 p6002
501.924 500.545 p6004
503.587 500.614 p6010
515.730 511.624 p6100
500.575 500.743 r6002
250.986 250.548 r6004
101.881 100.584 r6010
13.215 11.642 r6100
500.618 500.860 s6002
250.616 250.640 s6004
101.177 100.676 s6010
11.404 10.625 s6100
5.043 5.037 w4002
50.077 50.048 w5002
500.185 500.114 w6002
100.296 100.121 w6010
10.235 10.123 w6100
5.000.429 5.000.739 w7002
50.076.893 96.493.563 w8002
62.002.677 108.407.189 Total
0.1.1 -m2,3,0x03ededff PPMonstr J >0.2.0 a3 -mx Generic Compression Benchmark
2016/01/10 (>&b12) 2006/02/16 2016/11/28
2.923.969 3.003.153 2.865.327 uiq2-seed'0'-n1000000
2.924.115 3.002.768 2.865.090 uiq2-seed'1'-n1000000
2.924.322 3.003.251 2.865.234 uiq2-seed'2'-n1000000
2.925.567 3.004.457 2.866.751 uiq2-seed'3'-n1000000
2.922.792 3.001.321 2.864.111 uiq2-seed'4'-n1000000
2.924.509 3.003.981 2.865.290 uiq2-seed'5'-n1000000
2.924.659 3.003.513 2.865.911 uiq2-seed'6'-n1000000
2.924.616 3.003.203 2.865.552 uiq2-seed'7'-n1000000
2.925.579 3.004.358 2.867.011 uiq2-seed'8'-n1000000
2.926.293 3.004.433 2.867.016 uiq2-seed'9'-n1000000
29.246.421 30.034.438 28.657.293 Total
0,97376289 1,00000000 0,95414780 Size (Ratio)
Prime Number Benchmark
msb lsb dbl hex text twice interlv delta bytemap bitmap total tarball tarball/total (total = 4.803.388; tarball = 4.812.800, 7z.exe a -ttar)
39.547 39.314 40.403 41.851 39.019 39.569 40.831 34.251 25.789 32.110 372.684 292.304 0,78432130169 cmv -mx 0.1.1
39.379 39.181 40.011 41.629 38.829 39.400 40.548 33.999 23.851 31.659 368.486 290.754 0,78905033027 cmv -max 0.1.1
38.233 37.987 38.897 41.233 38.301 38.261 39.570 33.944 24.149 31.909 362.484 284.835 0,78578640712 cmv -mx 0.2.0 (2016/06/08)
37.963 37.679 38.711 40.893 38.131 37.995 39.136 33.925 22.864 31.895 359.192 281.555 0,78385654469 cmv -mx >0.2.0 a3 (2016/11/28)
37.835 37.973 38.597 41.619 38.829 37.894 36.255 33.726 23.851 29.527 356.106 Best overall (2016/01/14)
nz nz nz cmix cmv nz nz cmix cmv cmix
0.1.1 -mx 0.1.1 Optimal 0.2.0a3 ICM1 Opt. 0.2.0a3 Extreme Opt. >0.2.0 a3 -mx Darek's testbed
2016/01/10 2016/01/10 2016/09/27 2016/09/27 2016/11/01
1.404.493 1.399.464 1.381.660 1.380.299 1.386.222 0.WAV
323.277 322.990 303.175 301.866 303.864 1.BMP
835.277 834.675 744.125 734.102 745.829 A.TIF
780.311 779.596 692.949 683.189 693.092 B.TGA
327.568 325.838 318.478 318.919 321.261 C.TIF
310.952 310.480 303.235 303.056 303.979 D.TGA
497.089 496.837 496.142 494.695 496.487 E.TIF
110.914 110.799 110.755 110.571 110.809 F.JPG
1.367.023 1.366.851 1.358.405 1.356.314 1.357.977 G.EXE
482.720 482.701 462.836 463.150 461.994 H.EXE
227.898 227.791 217.797 217.981 217.765 I.EXE
43.364 43.325 43.172 43.179 43.173 J.EXE
2.618.200 2.616.775 2.540.011 2.536.718 2.534.630 K.WAD
2.830.322 2.829.223 2.751.490 2.745.121 2.751.856 L.PAK
55.438 55.027 52.420 51.990 52.617 M.DBF
86.689 86.688 83.967 83.837 83.811 N.ADX
3.777 3.775 3.672 3.668 3.670 O.APR
947 938 878 879 912 P.FM3
187.547 187.547 170.171 168.828 169.843 Q.WK3
29.327 29.245 28.781 28.768 28.782 R.DOC
26.020 25.970 25.447 25.435 25.418 S.DOC
18.752 18.727 18.367 18.367 18.285 T.DOC
8.681 8.646 8.536 8.532 8.530 U.DOC
18.570 18.519 18.213 18.206 18.198 V.DOC
13.228 13.185 13.007 13.002 12.995 W.DOC
11.049 10.997 10.854 10.857 10.863 X.DOC
323 320 314 314 318 Y.CFG
176 171 166 166 169 Z.MSG
12.619.932 12.607.100 12.159.023 12.122.009 12.163.349 Total
12.166.965 Testbed.tar (very close to single file compression total :-))
2016/11/??
- Word model (bit 9-10):
- Improved the non-word models.
- Added a new model to option 1: distance from the nth character after an LF, 30 predictors/characters.
- Added a new model to option 1: vowel/consonant/other, orders 4 and 7, 3 * 2 = 6 predictors.
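The vowel/consonant/other model above reduces each character to one of three classes before forming order-4 and order-7 contexts; a minimal sketch (the class boundaries are my assumption, not taken from CMV's source):

```python
# Reduce bytes to 3 classes (0 = vowel, 1 = consonant, 2 = other) and
# form an order-N context from the last N classes.
VOWELS = set(b"aeiouAEIOU")

def char_class(c: int) -> int:
    if c in VOWELS:
        return 0
    if 65 <= c <= 90 or 97 <= c <= 122:
        return 1          # consonant
    return 2              # other (digits, punctuation, ...)

def class_context(data: bytes, order: int) -> tuple:
    return tuple(char_class(c) for c in data[-order:])

print(class_context(b"example", 4))   # classes of 'm', 'p', 'l', 'e'
```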
2016/11/04
- ICM: better hash calculation --> slightly better precision in predictions.
- Word model (bits 9-10): now all predictors use the new ICM --> slightly better predictions.
- Extreme version: added 8 new predictors to the delta model.
Benchmarks for version in development (2016/11/01 >0.2.0a3, VOM model is disabled), compared to the last official (0.1.1) and previous development (2016/06/08) versions.
0.1.1 -mx 0.2.0 -mx >0.2.0.a3 -mx Maximum Compression
2016/01/10 2016/06/08 2016/11/01
820.501 819.931 819.620 A10.jpg
1.017.441 997.721 977.181 AcroRd32.exe
400.343 381.471 376.271 english.dic
3.574.241 3.566.950 3.557.998 FlashMX.pdf
280.132 267.665 262.734 FP.LOG
1.387.079 1.365.962 1.339.122 MSO97.DLL
687.731 683.352 679.068 ohs.doc
653.728 634.380 633.386 rafale.bmp
399.793 391.522 387.262 vcfiu.hlp
372.831 362.592 358.775 world95.txt
9.593.820 9.471.546 9.391.417 Total
9.632.713 9.515.604 9.400.385 MaxCompr.tar (very close to single file compression total :-))
100.031 sharnd_challenge.dat.cmv
34 Test_000.cmv
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0.a3 -mx Large Text Compression Benchmark
2016/01/10 (>&b12) 2016/06/08 2016/11/01
24 24 24 ENWIK0
31 32 32 ENWIK1
91 89 88 ENWIK2
299 287 285 ENWIK3
2.997 2.953 2.937 ENWIK4
25.568 25.248 25.174 ENWIK5
214.137 211.478 210.225 ENWIK6
1.968.717 1.941.513 1.921.495 ENWIK7
18.153.319 17.898.994 17.692.364 ENWIK8
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0.a3 -mx Other
2016/01/10 (>&b12) 2016/06/08 2016/11/01
194.417 192.013 191.470 book1
613.782 605.705 600.980 Calgary corpus.tar
617.996 604.811 Calgary corpus.tar.paq8pxd16
331.931 327.901 326.053 Canterbury corpus.tar
333.145 327.473 Canterbury corpus.tar.paq8pxd16
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0.a3 -mx Compression Competition -- $15,000 USD
2016/01/10 (>&b12) 2016/06/08 2016/11/01
14.897.825 14.782.006 14.757.363 SRR062634.filt.fastq
0.1.1 -m2,3,0x03ededff 0.2.0 -mx >0.2.0.a3 -mx Specific case - High redundant data in a pattern
2016/01/10 (>&b12) 2016/06/08 2016/11/01
1.242 1.282 1.243 NUM.txt
Prime Number Benchmark
msb lsb dbl hex text twice interlv delta bytemap bitmap total tarball tarball/total (total = 4.803.388; tarball = 4.812.800, 7z.exe a -ttar)
39.547 39.314 40.403 41.851 39.019 39.569 40.831 34.251 25.789 32.110 372.684 292.304 0,78432130169 cmv -mx 0.1.1
39.379 39.181 40.011 41.629 38.829 39.400 40.548 33.999 23.851 31.659 368.486 290.754 0,78905033027 cmv -max 0.1.1
38.233 37.987 38.897 41.233 38.301 38.261 39.570 33.944 24.149 31.909 362.484 284.835 0,78578640712 cmv -mx 0.2.0 (2016/06/08)
38.037 37.772 38.759 40.902 38.133 38.068 39.301 33.914 22.814 31.880 359.580 281.752 0,78355859614 cmv -mx >0.2.0a3 (2016/11/01)
37.835 37.973 38.597 41.619 38.829 37.894 36.255 33.726 23.851 29.527 356.106 Best overall (2016/01/14)
nz nz nz cmix cmv nz nz cmix cmv cmix
2016/10/??
- Started maintaining the CMV Extreme version as well.
It's CMV64 with 4x memory plus some improvements I made during the development of CMV; they are disabled in the standard version because they improve the compression ratio only slightly and I don't want to spend so much time or memory for a very small gain.
- N input mixer: works with 32 bits instead of 16. More precision most of the time, 2x memory.
- New ICM: the context hash bucket works with 64 contexts instead of 16. More precision most of the time; it can take up to 4x the time to search data in the bucket, same memory.
- Gap model 2 (bit 14): works with 36 (6x6) predictors instead of 16 (4x4), added order 4. ~2x time, +2x memory.
- PPM (bit 27): added order 2. More time, more memory.
- Last bits: new silly model, always enabled; it has 15 predictors, but the gain is very small. More time, more memory.
- Extreme may need up to 17 GB of RAM (I don't know if that's the maximum) and is more than 10% slower.
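A minimal sketch of the wider hash bucket mentioned above: the Extreme ICM's bucket holds 64 contexts instead of 16, so a worst-case linear search can take up to 4x as long while using the same total memory. The slot layout and checksum matching are assumptions for illustration:

```python
# Linear search of a context-hash bucket: each slot holds a context
# checksum; a lookup scans the bucket for a match (worst case scans
# all BUCKET_SIZE slots, hence up to 4x the time of a 16-slot bucket).
BUCKET_SIZE = 64          # 16 in the standard version

def find_slot(bucket: list, checksum: int) -> int:
    """Return the index of the matching slot, or -1 if absent."""
    for i in range(BUCKET_SIZE):
        if bucket[i] == checksum:
            return i
    return -1

bucket = [0] * BUCKET_SIZE
bucket[37] = 0xBEEF
print(find_slot(bucket, 0xBEEF))      # 37
```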
2016/09/29
Benchmarks for version in development (2016/09/28), VOM model is disabled.
0.2.0a3 -mx Maximum Compression
819.594 A10.jpg
978.102 AcroRd32.exe
377.493 english.dic
3.558.708 FlashMX.pdf
265.594 FP.LOG
1.340.401 MSO97.DLL
679.124 ohs.doc
633.388 rafale.bmp
387.773 vcfiu.hlp
359.548 world95.txt
9.399.725 Total
100.031 sharnd_challenge.dat
34 Test_000
2016/09/??
Added Exe model (bit 30).
2016/08..09/??
Added a new-style ICM: it has more precision and reduces context-hash collisions, but handles half the number of contexts.
So far I have changed the main Order-N, Word (bits 9-10), More sparse and masked (bits 15-16) and Exe (bit 30) models to use the new ICM.
Applying the new ICM to Gap model 1 (bit 13) and the Delta model (bit 26) hurts compression.
2016/08..09/??
- Sparse match model (bits 1-3): small improvement for option 7, to be continued.
- SSE (bit 29): very slightly improved.
~2016/07/??
The maximum order that the Variable order and memory model (VOM) handles is reduced from 10 (the help was wrong, it said 9) to 8.
This should improve the compression ratio a little bit.
2016/07/02
I will be on holiday from 2016/07/09 to 2016/07/24; I probably won't read encode.su in those days.
2016/06/25
The first attempt to improve the speed or compression ratio of the ICM failed.
Now I'm adding a byte-level PPM: order 0, and order 1 with gap 0/1/2/3.
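A sketch of what "order 1 gap 0/1/2/3" contexts could look like: an order-1 model with gap g predicts the next byte from the single byte g+1 positions back (gap 0 being plain order 1). How CMV actually hashes and stores these contexts is not shown here:

```python
# Order-1 context with a gap: the context is the byte g+1 positions
# back in the history (gap 0 = the immediately preceding byte).
def gap_context(history: bytes, gap: int):
    if len(history) <= gap:
        return None       # not enough history yet
    return history[-1 - gap]

h = b"abcdef"
print([gap_context(h, g) for g in range(4)])  # bytes 'f', 'e', 'd', 'c'
```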
2016/06/10-2016/06/19
Benchmarks for version in development (2016/06/08 (2)) compared to the last official and development (2016/02/01 (1)) versions.
The 0.2.0 version (dev.(2)) won't have the "Variable order and memory model" (VOM) enabled by default in -mx (to save time and memory: sometimes it hurts the compression ratio, and otherwise it saves only a few bytes); you can add it with "-m2,3,>|b12".
0.1.1 -mx dev.(1) -mx dev.(2) -mx Maximum Compression
820.501 820.140 819.931 A10.jpg
1.017.441 998.416 997.721 AcroRd32.exe
400.343 383.246 381.471 english.dic
3.574.241 3.567.537 3.566.950 FlashMX.pdf
280.132 266.983 267.665 FP.LOG
1.387.079 1.366.681 1.365.962 MSO97.DLL
687.731 682.960 683.352 ohs.doc
653.728 635.473 634.380 rafale.bmp
399.793 392.447 391.522 vcfiu.hlp
372.831 366.705 362.592 world95.txt
9.593.820 9.480.588 9.471.546 Total
9.632.713 9.524.816 9.515.604 MaxCompr.tar
0.1.1 -mx dev.(1) -mx dev.(2) -mx Silesia Open Source Compression Benchmark (SOSCB)
2.047.734 2.023.802 2.016.906 dickens
10.117.960 9.966.096 9.958.936 mozilla
2.065.773 2.012.206 2.004.632 mr (SOSCB best)
969.798 933.615 932.177 nci
1.698.391 1.663.621 1.662.164 ooffice
2.081.744 2.049.413 2.052.039 osdb
830.294 815.091 813.002 reymont
2.799.600 2.773.724 2.764.836 samba
3.807.684 3.775.695 3.776.032 sao
5.292.616 5.201.770 5.184.527 webster
3.577.455 3.561.918 3.556.659 x-ray (SOSCB best)
281.484 277.915 276.571 xml
35.570.533 35.054.866 34.998.481 Total
0.1.1 -m2,3,0x03ededff dev.(1) -mx dev.(2) -mx Large Text Compression Benchmark
(>&b12)
24 24 24 ENWIK0
31 31 32 ENWIK1
91 88 89 ENWIK2
299 285 287 ENWIK3
2.997 2.947 2.953 ENWIK4
25.568 25.295 25.248 ENWIK5
214.137 211.863 211.478 ENWIK6
1.968.717 1.946.606 1.941.513 ENWIK7
18.153.319 17.928.046 17.898.994 ENWIK8
0.1.1 -mx dev.(1) -mx dev.(2) -mx Other
194.417 192.249 192.013 book1
613.782 606.341 605.705 Calgary corpus.tar
331.931 327.840 327.901 Canterbury corpus.tar
0.1.1 -mx dev.(1) -mx dev.(2) -mx Compression Competition -- $15,000 USD
14.897.825 14.769.089 14.782.006 SRR062634.filt.fastq
0.1.1 -mx dev.(1) -mx dev.(2) -mx 5 public files from Darek's testbed
1.404.493 1.391.704 1.389.245 0.WAV
310.952 306.922 306.998 D.TGA
482.720 472.887 472.193 H.EXE
187.547 179.161 179.542 Q.WK3
29.327 28.987 28.951 R.DOC
0.1.1 -m2,3,0x03ededff dev.(1) -mx Specific case - High redundant data in a pattern
(>&b12)
1.242 1.282 NUM.txt (I don't like this result)
Prime Number Benchmark
msb lsb dbl hex text twice interlv delta bytemap bitmap total tarball tarball/total (total = 4.803.388; tarball = 4.812.800, 7z.exe a -ttar)
39.547 39.314 40.403 41.851 39.019 39.569 40.831 34.251 25.789 32.110 372.684 292.304 0,78432130169 cmv -mx 0.1.1
38.233 37.987 38.897 41.233 38.301 38.261 39.570 33.944 24.149 31.909 362.484 284.835 0,78578640712 cmv -mx dev. (2)
39.379 39.181 40.011 41.629 38.829 39.400 40.548 33.999 23.851 31.659 368.486 290.754 0,78905033027 cmv -max 0.1.1
37.835 37.973 38.597 41.619 38.829 37.894 36.255 33.726 23.851 29.527 356.106 Best overall (2016/01/14)
nz nz nz cmix cmv nz nz cmix cmv cmix
2016/06/05
I finished working on the word model:
- Changed from DCM to ICM.
- Now it handles 4x contexts.
- Current word contexts: wordbuf(0), wordbuf(0) + wordbuf(1), wordbuf(0) + wordbuf(2), wordbuf(0) + wordbuf(1) + wordbuf(2) + wordbuf(3).
- Now the word and non-word models always return a prediction.
- Deleted 3 secondary mixed predictions.
I changed some hash evaluations to slightly improve the predictions and the speed of some models.
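The four word contexts listed above combine recent words (wordbuf(0) being the current word, wordbuf(n) the nth previous one); a sketch of how such contexts might be hashed together (both hash functions here are illustrative, not CMV's):

```python
# Build the four word contexts by combining per-word hashes.
MASK = 0xFFFFFFFFFFFFFFFF

def hash_word(word: bytes) -> int:
    h = 0
    for c in word:                    # FNV-style byte mixing
        h = (h * 0x100000001B3 + c) & MASK
    return h

def combine(*hashes: int) -> int:
    h = 0
    for x in hashes:                  # fold word hashes into one context
        h = (h * 0x9E3779B97F4A7C15 ^ x) & MASK
    return h

wordbuf = [hash_word(w) for w in (b"model", b"word", b"the", b"on")]
contexts = [
    combine(wordbuf[0]),
    combine(wordbuf[0], wordbuf[1]),
    combine(wordbuf[0], wordbuf[2]),
    combine(wordbuf[0], wordbuf[1], wordbuf[2], wordbuf[3]),
]
print(contexts[0] == wordbuf[0])      # single-word context = that word's hash
```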
2016/05/21 (2016/09/20 Added sony2.arw)
Cmv 0.1.1, option -m2,0,+, Squeeze Chart (PDF, JPG, MP3, PNG, Installer.. (Compressing Already Compressed Files)), results not verified.
Documents
5.452.943 busch2.epub
84.864.362 diato.sf2
38.534.942 freecol.jar
12.571.833 maxpc.pdf
791.372 squeeze.xlsx
Image Formats (Camera Raw)
22.043.852 canon.cr2
11.109.503 fuji.raf
32.532.436 nikon.nef
15.859.052 oly.orf
11.978.628 sigma.x3f
14.945.507 sony.arw
20.288.840 sony2.arw
Image Formats (web)
1.781.843 filou.gif
4.642.178 flumy.png
6.834.776 mill.jpg
Installers
102.250.965 amd.run
184.804.805 cab.tar
54.089.566 inno.exe
18.528.542 setup.msi
54.112.501 wise.exe
Interactive files
340.066 flyer.msg
5.898.822 swf.tar
Scientific Data
14.915 block.hex
147.688.356 msg_lu.trace
110.989.967 num_brain.trace
34.696.885 obs_temp.trace
Songs (Tracker Modules)
6.501.083 it.it
15.354.477 mpt.mptm
7.286.070 xm.xm
Songs (web)
19.245.422 aac.aac
16.520.159 diatonis.wma
127.419.074 mp3corpus.tar
36.332.287 ogg.ogg
Videos (web)
15.139.010 a55.flv
32.151.082 h264.mkv
162.883.854 star.mov
96.480.593 van_helsing.ts
2016/05/20
Cmv 0.1.1, option -mx, Squeeze Chart (Txt Bible (Compressing Text In Different Languages)), results not verified.
693.479 afri.txt
746.900 alb.txt
624.237 ara.txt
705.168 chi.txt
922.105 cro.txt
749.296 cze.txt
695.223 dan.txt
729.682 dut.txt
659.272 eng.txt
649.197 esp.txt
723.353 fin.txt
692.336 fre.txt
717.826 ger.txt
221.765 gre.txt
625.971 heb.txt
782.307 hun.txt
729.490 ita.txt
637.865 kor.txt
816.845 lat.txt
676.332 lit.txt
678.285 mao.txt
694.682 nor.txt
712.650 por.txt
712.598 rom.txt
744.556 rus.txt
708.648 spa.txt
758.391 swe.txt
687.045 tag.txt
778.152 thai.txt
662.605 turk.txt
712.785 vie.txt
715.801 xho.txt
22.364.847 Total
2016/05/07
Cmv 0.1.1, vm.dll 1 2, result not verified.
820.313.136 -m2,3
2016/04/23
I'm working on the mixer and word model.
Current tests on SqueezeChart's "Compressing Already Compressed Files" look very nice.
2016/04/08
Cmv 0.1.1, ENWIK9.DRT, result not verified.
140.808.418 -m2,3,0x03ededff
2016/04/01 (2016/04/08 Added R.DOC)
Benchmarks for version in development (2016/02/01) compared to the last official version.
0.1.1 -mx develop.-mx 5 public files from Darek's testbed
1.404.493 1.391.704 0.WAV
310.952 306.922 D.TGA
482.720 472.887 H.EXE
187.547 179.161 Q.WK3
29.327 28.987 R.DOC
2016/03/26
Cmv 0.1.1, test file 2547.gif posted in paq8px, results not verified.
126.907.065 -m0
125.064.783 -m1
125.045.518 -m2
124.811.418 -m2,0,+
91.144.579 Precomp v0.4.4 -cn | cmv -m0
77.591.645 Precomp v0.4.4 -cn | cmv -m1
72.082.466 Precomp v0.4.4 -cn | cmv -mx
2016/03/26
A thread pool needs a rather long procedure to be worthwhile.
I won't implement it for the moment.
2016/02/20-27
Cmv 0.1.1, test files posted in "Text strings coding chemical structures".
-m0 ratio bpb -mx ratio bpb Original
19 0,00000 0,0000 23 0,00000 0,0000 0 01+crlf-duplicity.smi
21 7,00000 56,0000 25 8,33333 66,6667 3 01+crlf-sorted-duplicity_included.smi
21 7,00000 56,0000 25 8,33333 66,6667 3 01+crlf-sorted-duplicity_removed.smi
21 7,00000 56,0000 25 8,33333 66,6667 3 01+crlf.smi
21 10,50000 84,0000 25 12,50000 100,0000 2 01.smi
19 0,00000 0,0000 23 0,00000 0,0000 0 02+crlf-duplicity.smi
30 2,14286 17,1429 32 2,28571 18,2857 14 02+crlf-sorted-duplicity_included.smi
30 2,14286 17,1429 32 2,28571 18,2857 14 02+crlf-sorted-duplicity_removed.smi
30 2,14286 17,1429 31 2,21429 17,7143 14 02+crlf.smi
28 2,54545 20,3636 29 2,63636 21,0909 11 02.smi
19 0,00000 0,0000 23 0,00000 0,0000 0 03+crlf-duplicity.smi
55 0,78571 6,2857 48 0,68571 5,4857 70 03+crlf-sorted-duplicity_included.smi
55 0,78571 6,2857 48 0,68571 5,4857 70 03+crlf-sorted-duplicity_removed.smi
56 0,80000 6,4000 48 0,68571 5,4857 70 03+crlf.smi
53 0,91379 7,3103 46 0,79310 6,3448 58 03.smi
19 0,00000 0,0000 23 0,00000 0,0000 0 04+crlf-duplicity.smi
119 0,37072 2,9657 92 0,28660 2,2928 321 04+crlf-sorted-duplicity_included.smi
119 0,37072 2,9657 93 0,28972 2,3178 321 04+crlf-sorted-duplicity_removed.smi
125 0,38941 3,1153 97 0,30218 2,4174 321 04+crlf.smi
121 0,43525 3,4820 96 0,34532 2,7626 278 04.smi
19 0,00000 0,0000 23 0,00000 0,0000 0 05+crlf-duplicity.smi
309 0,21150 1,6920 228 0,15606 1,2485 1.461 05+crlf-sorted-duplicity_included.smi
309 0,21150 1,6920 229 0,15674 1,2539 1.461 05+crlf-sorted-duplicity_removed.smi
350 0,23956 1,9165 248 0,16975 1,3580 1.461 05+crlf.smi
349 0,26784 2,1427 258 0,19800 1,5840 1.303 05.smi
19 0,00000 0,0000 23 0,00000 0,0000 0 06+crlf-duplicity.smi
1.416 0,13441 1,0753 870 0,08258 0,6607 10.535 06+crlf-sorted-duplicity_included.smi
1.398 0,13270 1,0616 858 0,08144 0,6515 10.535 06+crlf-sorted-duplicity_removed.smi
1.843 0,17494 1,3995 1.064 0,10100 0,8080 10.535 06+crlf.smi
1.872 0,19537 1,5629 1.109 0,11574 0,9259 9.582 06.smi
65 0,55556 4,4444 59 0,50427 4,0342 117 07+crlf-duplicity.smi
7.922 0,10099 0,8079 3.999 0,05098 0,4078 78.443 07+crlf-sorted-duplicity_included.smi
7.862 0,10038 0,8030 3.955 0,05049 0,4040 78.326 07+crlf-sorted-duplicity_removed.smi
11.011 0,14037 1,1230 5.510 0,07024 0,5619 78.443 07+crlf.smi
11.237 0,15520 1,2416 5.686 0,07853 0,6283 72.402 07.smi
222 0,18049 1,4439 174 0,14146 1,1317 1.230 08+crlf-duplicity.smi
48.522 0,08193 0,6554 20.112 0,03396 0,2717 592.237 08+crlf-sorted-duplicity_included.smi
48.143 0,08146 0,6517 19.878 0,03363 0,2691 591.007 08+crlf-sorted-duplicity_removed.smi
68.820 0,11620 0,9296 31.406 0,05303 0,4242 592.237 08+crlf.smi
69.956 0,12658 1,0127 31.789 0,05752 0,4602 552.648 08.smi
859 0,10592 0,8473 576 0,07102 0,5682 8.110 09+crlf-duplicity.smi
315.453 0,06851 0,5481 110.170 0,02393 0,1914 4.604.485 09+crlf-sorted-duplicity_included.smi
312.400 0,06797 0,5437 108.927 0,02370 0,1896 4.596.375 09+crlf-sorted-duplicity_removed.smi
459.274 0,09974 0,7980 197.709 0,04294 0,3435 4.604.485 09+crlf.smi
462.969 0,10687 0,8550 198.545 0,04583 0,3667 4.331.887 09.smi
5.687 0,07334 0,5867 3.202 0,04129 0,3303 77.546 10+crlf-duplicity.smi
2.162.672 0,05995 0,4796 694.588 0,01926 0,1540 36.072.267 10+crlf-sorted-duplicity_included.smi
2.145.700 0,05961 0,4769 687.312 0,01909 0,1528 35.994.721 10+crlf-sorted-duplicity_removed.smi
3.159.277 0,08758 0,7007 1.370.511 0,03799 0,3039 36.072.267 10+crlf.smi
3.190.041 0,09339 0,7471 1.388.889 0,04066 0,3253 34.157.176 10.smi
29.229 0,05760 0,4608 13.362 0,02633 0,2106 507.475 11+crlf-duplicity.smi
15.158.795 0,05249 0,4200 4.649.225 0,01610 0,1288 288.771.259 11+crlf-sorted-duplicity_included.smi
15.040.303 0,05218 0,4174 4.607.367 0,01598 0,1279 288.263.784 11+crlf-sorted-duplicity_removed.smi
22.382.099 0,07751 0,6201 9.912.052 0,03432 0,2746 288.771.259 11+crlf.smi
22.639.068 0,08236 0,6589 10.061.120 0,03660 0,2928 274.870.869 11.smi
211.497 0,04940 0,3952 83.257 0,01945 0,1556 4.281.332 12+crlf-duplicity.smi
1.558.991 0,04199 0,3359 517.891 0,01395 0,1116 37.129.750 13+crlf-duplicity.smi
89.516.939 0,06652 0,5321 34.733.065 0,02581 0,2065 1.345.800.583 Total
2016/02/13 (2016/02/14 Edit text)
I'm looking into implementing a thread pool (Win32 threads): it's a nightmare.
2016/02/02 (2016/02/07 Added MaxCompr.tar, LTCB, Other. 2016/03/13 Added SRR062634.filt.fastq)
Benchmarks for version in development (2016/02/01) compared to the last official version.
0.1.1 -mx develop.-mx Maximum Compression
820.501 820.140 A10.jpg
1.017.441 998.416 AcroRd32.exe
400.343 383.246 english.dic
3.574.241 3.567.537 FlashMX.pdf
280.132 266.983 FP.LOG
1.387.079 1.366.681 MSO97.DLL
687.731 682.960 ohs.doc
653.728 635.473 rafale.bmp
399.793 392.447 vcfiu.hlp
372.831 366.705 world95.txt
9.593.820 9.480.588 Total
9.632.713 9.524.816 MaxCompr.tar
0.1.1 -mx develop.-mx Silesia Open Source Compression Benchmark (SOSCB)
2.047.734 2.023.802 dickens
10.117.960 9.966.096 mozilla
2.065.773 2.012.206 mr (SOSCB best)
969.798 933.615 nci
1.698.391 1.663.621 ooffice
2.081.744 2.049.413 osdb
830.294 815.091 reymont
2.799.600 2.773.724 samba
3.807.684 3.775.695 sao
5.292.616 5.201.770 webster
3.577.455 3.561.918 x-ray (SOSCB best)
281.484 277.915 xml
35.570.533 35.054.866 Total
0.1.1 -m2,3,0x03ededff develop.-mx Large Text Compression Benchmark
(>&b12)
24 24 ENWIK0
31 31 ENWIK1
91 88 ENWIK2
299 285 ENWIK3
2.997 2.947 ENWIK4
25.568 25.295 ENWIK5
214.137 211.863 ENWIK6
1.968.717 1.946.606 ENWIK7
18.153.319 17.928.046 ENWIK8
0.1.1 -m2,3,0x03ededff develop.-mx Other
194.417 192.249 book1
613.782 606.341 Calgary corpus.tar
331.931 327.840 Canterbury corpus.tar
0.1.1 -m2,3,0x00a968fd develop.-mx Compression Competition -- $15,000 USD
14.897.825 14.769.089 SRR062634.filt.fastq
2016/01/28
Version 0.1.1, new test on Silesia Open Source Compression Benchmark 1 2: precomp 0.4.4 -cn | cmv -mx (-mx is for comparison).
-mx .pcf | -mx Silesia Open Source Compression Benchmark
2.047.734 2.047.749 dickens
10.117.960 8.195.417 mozilla
2.065.773 2.065.797 mr
969.798 969.801 nci
1.698.391 1.698.456 ooffice
2.081.744 2.081.772 osdb
830.294 830.318 reymont
2.799.600 1.989.420 samba
3.807.684 3.807.687 sao
5.292.616 5.292.618 webster
3.577.455 3.577.474 x-ray
281.484 281.496 xml
35.570.533 32.838.005 Total
2016/01/26 (2016/10/27 Added Switches)
Version 0.1.1, new test on Maximum Compression benchmark: -max options (-mx is for comparison).
-mx -max Switches Maximum Compression
820.501 819.832 -m2,3,0x00ad69f5 A10.jpg
1.017.441 1.017.402 -m2,3,0x03ededff AcroRd32.exe
400.343 395.674 -m2,3,0x00a9fb7d english.dic
3.574.241 3.573.680 -m2,3,0x03edfc7d FlashMX.pdf
280.132 278.282 -m2,3,0x03eb7dff FP.LOG (-max -a15,M5)
1.387.079 1.386.634 -m1,3,0x03ede5fd MSO97.DLL
687.731 687.186 -m2,3,0x03ededfd ohs.doc
653.728 651.837 -m1,2,0x00a9f9bf rafale.bmp
399.793 399.784 -m2,3,0x03edfdbf vcfiu.hlp
372.831 372.551 -m2,3,0x03ededfd world95.txt
9.593.820 9.582.862 Total
2016/01/16 (2016/01/22 Added benchmark)
Since the beginning, cmv has had the quantized entropy of the previous few bytes in the contexts of 3 mixers (1+2+0) of the last-stage mixers (6+3+1).
Now I'm testing a model whose context includes the quantized entropy (q.e.) of the previous few bits and the q.e. of the same bit position in the previous few bytes.
-m2,0,* +new model Maximum Compression
820.749 820.736 A10.jpg.cmv
1.019.353 1.018.527 AcroRd32.exe.cmv
403.026 402.533 english.dic.cmv
3.577.000 3.576.947 FlashMX.pdf.cmv
281.065 280.861 FP.LOG.cmv
1.391.254 1.390.140 MSO97.DLL.cmv
688.787 688.334 ohs.doc.cmv
653.804 651.149 rafale.bmp.cmv
400.620 400.145 vcfiu.hlp.cmv
374.695 374.560 world95.txt.cmv
9.610.353 9.603.932 Total
2016/01/14
Stuff made for version 0.2:
- Added mixer2 of 2 mixerN, one with fast and one with slow learning rate: mixer2(mixerN_fast_lr(), mixerN_slow_lr()).
The mixer will be ~2x slower and will need ~2x the memory.
Need to add a switch to disable/enable this mixer.
Some tests in http://encode.su/threads/2284-CMV?p=...ll=1#post46203
- During cmv development I tested some naive SSE; last month I implemented a serious SSE, but it's good only for doubling the input predictions of a mixer, not for tuning a mixer.
Linear SSE seems to be better than the logistic one.
5-bit quantization is better, but 3-bit is quite good and 4x smaller, so I chose 3 bits.
The mixer will be ~2x slower and will need ~2x the memory.
Need to add a switch to disable/enable this SSE.
SSE2 (2 input predictions + context) hurts compression ratio.
- Proof of concept: added 4*2 low-order delta-like models.
Added 4 low-order interleaved models and 1 BMP-oriented model (no detection of the BMP ID).
Need to add a switch to disable/enable them.
-mx (1) -mx (2) -mx (3) Maximum Compression
820.501 820.891 820.160 A10.jpg
1.017.441 1.014.559 999.176 AcroRd32.exe
400.343 398.988 383.418 english.dic
3.574.241 3.575.464 3.567.760 FlashMX.pdf
280.132 279.555 267.099 FP.LOG
1.387.079 1.386.033 1.367.716 MSO97.DLL
687.731 688.414 683.408 ohs.doc
653.728 641.349 636.813 rafale.bmp
399.793 399.125 392.915 vcfiu.hlp
372.831 372.893 366.762 world95.txt
9.593.820 9.577.271 9.485.227 Total
-mx (1): 0.1.1
-mx (2): -mx (1) + delta-like + interleaved + BMP-oriented
-mx (3): -mx (2) + SSE + mix2(mixN_fast, mixN_slow), ~5720 Mb, speed ~2.8x slower than -mx (1)
2016/01/12 (2016/01/14 Added cmix)
Prime Number Benchmark http://encode.su/threads/2414-Prime-Number-Benchmark
msb lsb dbl hex text twice interlv delta bytemap bitmap total tarball tarball/total (total = 4803388; tarball = 4812800, 7z.exe a -ttar)
39547 39314 40403 41851 39019 39569 40831 34251 25789 32110 372684 292304 0,78432130169 cmv -mx
39379 39181 40011 41629 38829 39400 40548 33999 23851 31659 368486 290754 0,78905033027 cmv -max
37835 37973 38597 41629 38829 37894 36255 33999 23851 30518 357380 Best overall (2016/01/12)
nz nz nz cmv cmv nz nz cmv cmv zpaq
37835 37973 38597 41619 38829 37894 36255 33726 23851 29527 356106 Best overall (2016/01/14)
nz nz nz cmix cmv nz nz cmix cmv cmix
msb -m0,0,0x00abf9fd lsb -m1,1,0x00a96b7f dbl -m0,3,0x00ab69bd hex -m1,2,0x02a1e97f text -m1,3,0x00a1e57f
twice -m0,0,0x00abf9fd interlv -m1,3,0x01a8f86d delta -m2,3,0x03a809bd bytemap -m0,3,0x00aa096e bitmap -m0,0,0x00aaa8ef
tarball -m2,3,0x03ade97f
The word model is disabled in 9/11 cases.
The variable order and memory model is disabled in 8/11.
The counter type is always set to 0 (1 counter) (11/11).
2016/01/10 Start this experiment.
I don't want to take much time to verify my English in this post; sorry if it's bad.