BCM 0.11 on OSHO.TXT (206,908,949 bytes)
No code has to be inserted here.
BCM 0.11 on OSHO.TXT (206,908,949 bytes)
No code has to be inserted here.
I did some quick tests at metacompressor.com:
file16 binary exe binary executable 7,882,752 bytes:
comp: decomp:
761,368 bytes 9.837 sec. 9.762 sec. nanozip 0.08a 64bit nz -cc -m1750m
1,068,904 bytes 0.256 sec. 0.222 sec. bsc 2.2.0 64bit bsc e
1,097,550 bytes 1.626 sec. 1.448 sec. bcm 0.11 bcm
1,110,210 bytes 1.232 sec. 1.197 sec. bcm 0.10 bcm
1,146,156 bytes 0.247 sec. 0.164 sec. csc 3.2a5 csc32a5 -m2
1,148,585 bytes 0.351 sec. 0.165 sec. csc 3.2a5 csc32a5 -m3
1,263,193 bytes 0.216 sec. 0.167 sec. csc 3.2a5 csc32a5 -m1
1,617,113 bytes 0.046 sec. 0.029 sec. Etincelle RC2 etincelle -c
file29 image bmp image 2592x1944 pixels 15,116,598 bytes:
5,179,743 bytes 3.042 sec. 2.924 sec. bcm 0.11 bcm
5,214,043 bytes 2.575 sec. 2.602 sec. bcm 0.10 bcm
5,291,336 bytes 13.755 sec. 11.320 sec. nanozip 0.08a 64bit nz a -cc -m1750m
5,464,152 bytes 0.929 sec. 0.703 sec. bsc 2.2.0 64bit bsc e
7,374,670 bytes 2.024 sec. 0.560 sec. csc 3.2a5 csc32a5 -m3
7,430,910 bytes 1.360 sec. 0.575 sec. csc 3.2a5 csc32a5 -m2
7,459,324 bytes 1.078 sec. 0.578 sec. csc 3.2a5 csc32a5 -m1
8,886,413 bytes 0.193 sec. 0.122 sec. Etincelle RC2 etincelle -c
news text text 377,109 bytes:
92,603 bytes 1.046 sec. 1.017 sec. nanozip 0.08a 64bit nz a -cc -m1750m
110,644 bytes 0.037 sec. 0.020 sec. bsc 2.2.0 64bit bsc e
111,313 bytes 0.078 sec. 0.062 sec. bcm 0.11 bcm
111,567 bytes 0.065 sec. 0.050 sec. bcm 0.10 bcm
127,072 bytes 0.032 sec. 0.014 sec. csc 3.2a5 csc32a5 -m3
130,211 bytes 0.027 sec. 0.014 sec. csc 3.2a5 csc32a5 -m2
134,753 bytes 0.022 sec. 0.015 sec. csc 3.2a5 csc32a5 -m1
143,934 bytes 0.008 sec. 0.005 sec. Etincelle RC2 etincelle -c
pic binary binary 513,216 bytes:
40,543 bytes 0.990 sec. 1.171 sec. nanozip 0.08a 64bit nz a -cc -m1750m
44,949 bytes 0.082 sec. 0.060 sec. bcm 0.10 bcm
45,064 bytes 0.086 sec. 0.074 sec. bcm 0.11 bcm
47,081 bytes 0.050 sec. 0.017 sec. bsc 2.2.0 64bit bsc e
48,751 bytes 0.017 sec. 0.009 sec. csc 3.2a5 csc32a5 -m2
49,127 bytes 0.018 sec. 0.009 sec. csc 3.2a5 csc32a5 -m3
51,201 bytes 0.013 sec. 0.009 sec. csc 3.2a5 csc32a5 -m1
65,329 bytes 0.007 sec. 0.004 sec. Etincelle RC2 etincelle -c
trans text text 93,695 bytes:
13,693 bytes 5.329 sec. 4.304 sec. nanozip 0.08a 64bit nz a -cc -m1750m
16,907 bytes 0.022 sec. 0.010 sec. bsc 2.2.0 64bit bsc e
17,604 bytes 0.019 sec. 0.013 sec. bcm 0.10 bcm
17,668 bytes 0.023 sec. 0.017 sec. bcm 0.11 bcm
19,626 bytes 0.009 sec. 0.003 sec. csc 3.2a5 csc32a5 -m3
20,543 bytes 0.008 sec. 0.003 sec. csc 3.2a5 csc32a5 -m2
21,384 bytes 0.007 sec. 0.003 sec. csc 3.2a5 csc32a5 -m1
22,474 bytes 0.005 sec. 0.003 sec. Etincelle RC2 etincelle -c
A small BCM's Page & Mirror/Archive:
http://encode.narod.ru
Ilia, what about make small page with all your nice programs in one place (balz, quad, lzpm, ppmx, pim, pimple, tc...)?
Thanks Ilia!
Mirror: http://lovepimple.110mb.com/
Working on BCM v0.12... I dislike the fact that BCM is out of the board currently. I've found a few things that can be easily improved in current version - like a faster arithmetic encoder. The thing with CM is - we can add new models/modeling techniques to improve compression at the cost of processing speed and visa versa. Examples are BCM 0.09 - a high compression BWT+strong CM and BCM 0.10 - BWT+light CM. BCM can be notable faster than v0.10 and v0.11. Thinking about something new - a new CM structure - i.e. still not DC or QLFC. CM has a higher compression potential than DC or QLFC, at the cost of speed. My new fast CM is really fast and its speed is really close to the best DC and QLFC implementations. Still can't figure out about the best CM configuration and model set. What is good balance between good compression and fast compression/decompression speed? As example, how fast the ENWIK8 should be decompressed?
Of course there are things you can do to improve both compression and speed, like alphabet reordering, high order LZP preprocessing, and coding run lengths in your context model after BWT or use single bits to code repeats. Don't know if you already do this stuff. (BBB doesn't). Large memory models like BBB would help on enwik9. You could speed up forward BBB/BWT by replacing quicksort() + merge with libDivSufSort + merge or something.
> I've found a few things that can be easily improved in current
> version - like a faster arithmetic encoder.
Sure, there're several tricks which you could use... like
16-bit or 32-bit rc i/o, renorm skipping, branch removal,
keeping rc state in local vars etc.
But these don't really improve speed that much, and its
always harder to optimize decoding speed - its normally slower
than encoding, and less tricks apply.
> The thing with CM is - we can add new models/modeling techniques to
> improve compression at the cost of processing speed and visa versa.
Yes, but thats true for any BWT postcoders. Anyway, BWT results
are inherently limited, and for most users a difference between
210k and 220k on book1 doesn't really matter.
Well, I mean, DC or MTF or QLFC may hide some dependencies
(thus limiting the compression potential), but they still
have the same tradeoff options - its not like there's one
way to implement a QLFC postcoder.
> CM has a higher compression potential than DC or QLFC, at the cost
> of speed.
Yes, and I have doubts about speed too.
DC/MTF/QLFC all have to do that complex symbol ranking,
so it might be possible to design a simpler and faster direct
coder with similar compression.
> Still can't figure out about the best CM configuration and model
> set.
Mix2(o0,o1) with logistic mixing seems to show results near bsc's.
Then there're obvious improvements (counters->fsm and mix2->SSE2),
so overall I guess that should be about right.
> What is good balance between good compression and fast
> compression/decompression speed?
It only makes sense if you'd manage to make it faster than bsc,
and that's not so easy even if you'd use a plain o0 coder,
so I think it may be a bit early to look further atm?
> As example, how fast the ENWIK8 should be decompressed?
1. bsc is multithreaded so you'd need either to make a hacked
version for comparison (eg add your file i/o to its qlfc), or
consider making your coder MT as well.
2. getc/putc i/o is really problematic for a simple coder,
but bsc's pure blockwise approach isn't perfect either...
its hard to buffer enough blocks to smooth it out.
Actually, a direct CM may have benefits here, like how it
can encode the BWT symbols without having the whole BWT output in memory,
and write down the code right away.
2a. Note that bsc doesn't use much winapi, so things like VirtualAlloc
with MEM_LARGE_PAGES flag, windows threading API, i/o API (eg. overlapped or mapped i/o)
might allow to beat it.
3. You really need RLE and efficient symbol decomposition in postcoder.
4. In benchmarks, preprocessing (like LIPT and table reordering) would be more helpful
than anything you can do to the model.
2 Shelwien:
Strange thing about SSE2 on BWT data. In all of my tests O0+O1 (static mixing)+SSE always better than O0+O1+SSE2. Both SSEs have no linear/bilinear interpolation. Both have been optimized with my brute-force optimizer. Probably, I'm doing something wrong. SSE2 for sure properly initialazed - sse2(p0, p1) = (p0+p1)/2. Mixing original probability with SSE output usually helps (p+3ssep), what should we do with SSE2? Is interpolation a must have for SSE2?
Well, I actually talked about SSE2 with logistic inputs there... linear mixing didn't peform that well in mix_test experiments,
so SSE2 simulating linear mixing probably won't either. And yes, interpolation is very helpful there.
I didn't really try SSE2(o0,o1) on bwt data though, maybe I should.
As to your optimization btw, what kind of parameters are you talking about?
Something like shift counts? I mean, its better to have more precision at this point.
I have 17-bit counters. Parameters are counters' update rate, sse's quantization parameters, etc
Just for reference, CM part from BCM 0.11 compresses ENWIK8.BWT in 30 sec. i.e. new BCM 0.12 is about two times faster (~17-18 sec) and its compression is on par with BCM 0.11.