When I wrote an inverse BWT in ZPAQL, I stuffed an extra byte in the BWT for the EOF. Then you can eliminate the (i>=p) everywhere.
encode (27th February 2015)
AnthonyThrashD (5th March 2015),Bulat Ziganshin (2nd March 2015),Fu Siyuan (3rd March 2015),ivan2k2 (3rd March 2015),Mat Chartier (3rd March 2015),Matt Mahoney (4th March 2015),m^2 (5th March 2015),ne0n (3rd March 2015),nemequ (3rd March 2015),quadro (24th May 2015),spark (3rd March 2015),xezz (3rd March 2015)
http://mattmahoney.net/dc/silesia.html
http://mattmahoney.net/dc/text.html#1639
Sorry I couldn't test enwik9 with -b1000
encode (4th March 2015)
BTW, the 64-bit build can be slightly faster - I have to remove 32-bit related optimizations - will add a couple of #ifdef _WIN32 in the next versions of BCM.

Code:
C:\Test>bcm -b100 -f enwik8
enwik8: 100000000 -> 20792796 in 12.766s

C:\Test>bcm -d -f enwik8.bcm e8
enwik8.bcm: 20792796 -> 100000000 in 11.922s

C:\Test>bcm -b1000 -f enwik9
enwik9: 1000000000 -> 164251284 in 147.229s

C:\Test>bcm -d -f enwik9.bcm e9
enwik9.bcm: 164251284 -> 1000000000 in 142.072s
Thanks. Updated http://mattmahoney.net/dc/text.html#1639
My webpage was broken yesterday due to uploading to my website not working. It seems to be fixed now.
It is a very small improvement for encoding the block size and pidx.
Thanks, but it's not an improvement - it may really hurt compression in some cases with tiny blocks - I tested this approach many times...
In addition, you are wrong here - p(0.5) for Encoder::Encode is 1<<17, not 32768. Don't confuse it with the Counter's scale - the final probability scale is 1<<18: p+ssep+ssep+ssep.
Also, be aware of changing the output format - all public versions of BCM must be compatible now!
The thing I'd really love to improve is the inverse BWT - many compilers output really inefficient code for it, so it ends up even slower than the forward BWT transform. Only with PGO does it become faster!
Quote:
In addition, you are wrong here - p(0.5) for Encoder::Encode is 1<<17, not 32768.

I'm sorry, corrected it.
The second, and very important, purpose of such slightly redundant size/pidx coding is error detection - since BCM has no CRC or other checksum checking...
bcm-w64 -b1000 source archive
seems (under Win Server 2008 R2) to be faster and to compress better than
bcm-w32 source archive
Can this be true?
I will run further tests.
best regards
In many cases, the larger the block size, the better. And yes, the 64-bit version should be faster. However, with the current inverse BWT, the 32-bit version is faster at decompression.
Thank you for your answer.
PGO for further x64 optimization...
Does that mean we need a special Intel x64 version and another special AMD x64 version?
best regards
I didn't look at the code, but in zpaq I have 2 versions of the inverse BWT, one for blocks up to 16 MB and one for blocks up to 4 GB. For the smaller blocks, I pack the linked list pointers and the character it points to into a single 32 bit variable. This speeds up the algorithm because there is only one cache miss per character instead of two. (Building the list like this only requires sequential memory access). For larger blocks you could have a 5 byte structure to do this, but since I was writing it in ZPAQL it wouldn't be easy to do.
I think I recall BSC has an inverse BWT that outputs 2 bytes at a time to reduce cache misses even further.
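Here is a minimal sketch of the packing idea in plain C++ (illustrative only - the function and variable names are mine, not the actual ZPAQL or BSC code, and it assumes the usual "primary index = row of the original string" convention): for blocks up to 16 MB the successor index fits in 24 bits, so the pointer and the byte it points to share a single 32-bit word and the traversal loop does one table lookup per output byte.

Code:
#include <cstdint>
#include <vector>

// Inverse BWT for blocks up to 16 MB: pack the 24-bit successor index and the
// 8-bit symbol it points to into one 32-bit word, so following the list causes
// at most one cache miss per decoded byte instead of two.
// 'bwt' is the transformed block, 'idx' the primary index.
std::vector<uint8_t> inverse_bwt_packed(const std::vector<uint8_t>& bwt, uint32_t idx)
{
    const uint32_t n = static_cast<uint32_t>(bwt.size());
    if (n == 0) return {};

    std::vector<uint32_t> t(n);                // packed: (successor << 8) | bwt[successor]
    uint32_t count[256] = {0};

    for (uint32_t i = 0; i < n; ++i)           // symbol frequencies
        ++count[bwt[i]];

    uint32_t sum = 0;                          // turn counts into bucket start offsets
    for (int c = 0; c < 256; ++c) {
        uint32_t cnt = count[c];
        count[c] = sum;
        sum += cnt;
    }

    for (uint32_t i = 0; i < n; ++i)           // build the linked list in one sequential pass
        t[count[bwt[i]]++] = (i << 8) | bwt[i];

    std::vector<uint8_t> out(n);
    uint32_t u = t[idx];
    for (uint32_t i = 0; i < n; ++i) {         // one lookup per output byte
        out[i] = static_cast<uint8_t>(u);
        u = t[u >> 8];
    }
    return out;
}

For blocks above 16 MB the pointer no longer fits in 24 bits, which is where the 5-byte-per-entry structure mentioned above would come in.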
ICL's PGO (Profile Guided Optimization) didn't help with a 64-bit compile. Can't say about modern AMD CPUs - I don't have one.
I know about this trick - will test it more carefully. I didn't include it right away because the default block size is 20 MB (to keep the memory usage slightly over 100 MB).
Running under Win Server 2008 R2 (virtual machine, AMD processors):
bcm-w64 -b1000 e:backup.dat c:bac-bcm64
e:backup.dat: 54801567744 -> 5349330866 in 35905.593s
bcm-w32 e:backup.dat c:bac-bcm32
e:backup.dat: 54801567744 -> 6670331585 in 18204.265s
For now: bcm-w64 has good potential!
bcm-w64 compresses nearly 20% better than bcm-w32,
but in my test
bcm-w64 takes nearly twice as much time as bcm-w32.
If bcm-w64 could run as quickly as bcm-w32,
then it might even beat 7zip in efficiency.
I will run further tests - best regards
As a note, you're comparing 20 MB block BWT with 1000 MB block BWT...
You may use -b2047 or even -b2097151k as well!
A memo for myself
Code:
BCM - A BWT-based file compressor, v1.01

Usage: bcm [options] infile [outfile]

Options:
  -b#[k]  Set block size to # MB or KB (default is 20 MB)
  -d      Decompress
  -f      Force overwrite of output file
It's good that the source code is so small, so at least it builds really fast, with really nice results overall.
In the few tests I've done, it doesn't seem to use much above 5n space for a given block size, which is also a good point.
(Tested on a netbook: Aspire One Intel(R) Atom(TM) CPU N2600 @ 1.60GHz, 2 cores Ubuntu 64-bit )
./bcm -b100 -f enwik8 enwik8.bcm
enwik8: 100000000 -> 20792796 in 118.390s
C.Time: 116.62 s user, 1.78 s system, 1:59.84 total; Used 98% CPU, memory: 489404 kB
./bcm -d -f enwik8.bcm enwik8.dec
enwik8.bcm: 20792796 -> 100000000 in 94.042s
D.Time: 90.92 s user, 3.13 s system, 1:35.38 total; Used 98% CPU, memory: 489184 kB
BCM v1.01 @ GitHub:
https://github.com/encode84/bcm
Ilya,
BCM is a fantastic combination of speed, code simplicity and compression efficiency.
Thanks for sharing the code.
I adapted it to my project:
https://github.com/flanglet/kanzi/blob/master/java/src/kanzi/entropy/CMPredictor.java
https://github.com/flanglet/kanzi/bl...CMPredictor.go
java -cp kanzi.jar kanzi.app.BlockCompressor -input=enwik8 -output=none -block=100m -transform=bwt -entropy=cm -verbose=1
Encoding ...
Encoding: 26546 ms
Input size: 100000000
Output size: 20792267
Ratio: 0.20792268
Throughput (KB/s): 3678
go run BlockCompressor.go -input=enwik8 -output=none -block=100m -transform=bwt -entropy=cm -verbose=1
Encoding ...
Encoding: 47726 ms
Input size: 100000000
Output size: 20792267
Ratio: 0.207923
Throughput (KB/s): 2046
encode (10th April 2016)
I see some code from BCM here... And I see no project contributors list...
Well, just a quick note on SLOW_RATE/FAST_RATE:
SLOW_RATE is 6 here, fast is 2.
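For illustration, a dual-rate bit model in this spirit might look like the sketch below (the struct and member names are mine and the 0.5 initialization is an assumption; only the 2/6 shift rates come from the note above):

Code:
#include <cstdint>

// Two probabilities adapt at different speeds and are summed: the fast half
// reacts to local changes while the slow half keeps longer-term statistics.
struct DualRateCounter {
    static const int FAST_RATE = 2;   // fast-adapting half
    static const int SLOW_RATE = 6;   // slow-adapting half
    uint16_t p_fast = 1 << 15;        // both halves start at p = 0.5
    uint16_t p_slow = 1 << 15;

    int p() const { return p_fast + p_slow; }   // prediction on a 1<<17 scale

    void update(int bit) {
        if (bit) {
            p_fast += (65536 - p_fast) >> FAST_RATE;
            p_slow += (65536 - p_slow) >> SLOW_RATE;
        } else {
            p_fast -= p_fast >> FAST_RATE;
            p_slow -= p_slow >> SLOW_RATE;
        }
    }
};

Summing the two halves is effectively a cheap fixed-weight mix of a fast and a slow estimator.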
encode (10th April 2016)
Uh, I searched for a detailed README with algorithm descriptions... You have compression algos from A to Z here!
I am not claiming it is super detailed but you can start with the WIKI: https://github.com/flanglet/kanzi/wiki
encode (10th April 2016)
Updated to BCM v1.02. Sort of a final version that features long-awaited optimizations!
https://github.com/encode84/bcm
Alexander Rhatushnyak (13th April 2016),comp1 (13th April 2016),Cyan (13th April 2016),Stephan Busch (13th April 2016)
A makefile or a toCompile.bat would be helpful.
Also, why not (k<<12)-(k==16) instead of (k-(k==16))<<12 in CM()? Seems more logical, doesn't it?
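For concreteness - assuming k runs 0..16 over a probability scale of 1<<16, which is what the expression suggests - the two forms only differ at the top step:

Code:
#include <cstdio>

// Compare the two SSE-init expressions discussed above at a few values of k.
int main() {
    const int ks[] = {0, 1, 15, 16};
    for (int k : ks) {
        int current   = (k - (k == 16)) << 12;   // form used in CM(): tops out at 61440
        int suggested = (k << 12) - (k == 16);   // suggested form: tops out at 65535
        std::printf("k=%2d  (k-(k==16))<<12 = %5d   (k<<12)-(k==16) = %5d\n",
                    k, current, suggested);
    }
    return 0;
}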
encode (13th April 2016)
You are right! It's just the SSE initialization I've been using for years (if not decades)!
Collecting ideas for BCM v2.00...
Should I release a slightly changed version of BCM, breaking compatibility (BCM is not that widespread I presume)?
Maybe with some extra features you suggest? (CRC32 checking? Higher compression at slower speed? ...?)
I believe you have no issue with "installed base".
But the current BCM source has a great property: simplicity.
This is a very worthy characteristic, worth more than small gains and extra features.
Now, of course, adding features/performance while preserving simplicity would be a great way forward.
Can't you just include the old decompressor code in the new release? Or do something like SQX, which can write 2 different versions of the format - 1.1 and 2.0, I think.