I modified Shelwien's code to first use Yuta's BWT and BWTS, and I made MinGW executables. It's not clean, just a quick test, but it does seem to work.
Last edited by biject.bwts; 8th December 2009 at 07:45.
Thanks Shelwien for letting me test the openbwt v1.4 bwt and
bwts routines with BWTMix
enwik8 original
20,608,793
enwik8 bwt from openbwt version 1.4
20,608,794
enwik8 bwts from openbwt version 1.4
20,608,800
I compressed, decompressed, and then compared:
all the same.
All 3 compress almost the same. I tested several smaller files
where the order of which is smallest changed. It seems that for small
files the BWTS version is slightly better, by a byte or 2.
I will still write some fully bijective BWTS compressors, but
this was a needed break.
Why don't you also measure the encoding/decoding times for all 3 versions?
openbwt ones might easily be faster... or not, depending on your compiler :)
That would be a good idea; however, I need to clean up what I did
and recompile them with the same compiler to make the comparison
more meaningful. But eyeballing the messages as they came to my
screen, I would say the BWT or BWTS parts are not the major parts
of the program that consumed the time.
I will add code to print a relative time stamp before and after the
major parts. If you have any preferred code you use to do that, and it is
only a few lines, send it to me and I will use it.
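For the relative time stamps, something like the following could do. It is only a minimal sketch and assumes a C++11 compiler for std::chrono; the name StageTimer and the label text are made up for illustration and are not from BWTmix or openbwt.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical helper (not part of BWTmix): reports the time elapsed
// since the object was constructed.
class StageTimer {
public:
    StageTimer() : start_(std::chrono::steady_clock::now()) {}

    // Seconds elapsed since construction, measured on a monotonic clock.
    double elapsed() const {
        std::chrono::duration<double> d =
            std::chrono::steady_clock::now() - start_;
        return d.count();
    }

    // Print a relative time stamp with a label to stderr, so it does not
    // mix with data written to stdout.
    void stamp(const char* label) const {
        std::fprintf(stderr, "[%9.3f s] %s\n", elapsed(), label);
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

Create one StageTimer at program start, then call stamp("before BWT") and stamp("after BWT") around each major part. The difference between two stamps is the time spent in that part.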
OK, changed BWTMIX to BMIX (your stuff)
BMIX1 Yuta's BWT
BMX2 Yuta's BWTS
Using ENWIK8 to compress and decompress:
BMI 680 seconds
BMI1 517 seconds
BMI2 627 seconds
The actual times are in the files BMI*.DAT.
The BMI*.cpp files are the main code with the timing code added.
The times varied each time I ran it, so compile and try it on your
own.
It's all in the file BMI.zip.
Created 3 versions of BWTmix. One is the old one, built with my new MinGW compiler and changed to add timing info;
called it bmix.
Replaced the BWT routines with BWTS routines and made UNBWTS use 5n for storage;
called it bmixbwts.
Did a reverse on the read buffers, then swapped the output, since it is fast to reverse the file on output;
called it bmixbwtsr.
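The reversal step in bmixbwtsr can be pictured roughly like this. This is only a sketch of the general idea (reverse the bytes of a block before the transform, reverse again after the inverse transform); reverse_block is a made-up name, and the actual bmixbwtsr code may handle its buffers differently.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: reverse a block of bytes in place.
// Applying it twice restores the original order, so the same call
// can be used before compression and after decompression.
void reverse_block(std::vector<unsigned char>& block) {
    std::reverse(block.begin(), block.end());
}
```

The reversal is a single linear pass over the block, so it should be cheap next to the sort inside BWT or BWTS.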
Created three executables that I did not test but compiled for athlon-xp,
in case Sami wants to look at them. He rejected the 6n version that was
in UnBWTS, so I created a UNBWTSS that is 5n, period.
bmix_xp
bmixbwts_xp
bmixbwtsr_xp
Test group one uses the c125 option, for example: bmix c125 enwik8 x.x
Test group two is like the above but with c1250, for example: bmix c1250 enwik8 x.x
group one c125
      size       TT       CT       DT
22,846,832  360.160  233.874  126.286  bmix
22,846,810  375.301  235.165  140.136  bmixbwts
22,956,704  307.701  187.730  119.971  bmixbwtsr
group two c1250
      size       TT       CT       DT
20,608,793  424.494  291.226  133.268  bmix
20,608,800  366.957  216.389  150.568  bmixbwts
20,672,198  344.712  220.152  124.560  bmixbwtsr
I only ran the test once, but for this kind of data reversing the file seems
like a bad idea.
Use at your own risk: it worked on my machine;
it may not work on yours.
As you know, I redid the UNBWTS in Yuta's version 1.4 to make
large files decompress in 5n memory instead of 6n.
Forget the timing, but the BWTS version is faster on my machine than
Sami's BWT version on his machine by a factor of 2. But that's apples and oranges.
ENWIK9
BWTmix c1250 178,510,043 Sami's test
bmixbwts c1250 178,509,848 My machine
This is only for ENWIK9. For some files the BWT version is
the best; for others the BWTS version is the best.
But on this standard test file, dropping in BWTS for
BWT seems to do best, though the two are very close.
I also did fc /b, which showed that the decompressed file
matched ENWIK9.
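For anyone without fc, the same round-trip check can be done with a few lines of C++. This is a minimal sketch; files_identical is a made-up name and is not part of BWTmix, and it treats a file that cannot be opened as a mismatch.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true only if both files can be opened
// and contain exactly the same bytes.
bool files_identical(const std::string& path_a, const std::string& path_b) {
    std::ifstream a(path_a.c_str(), std::ios::binary);
    std::ifstream b(path_b.c_str(), std::ios::binary);
    if (!a || !b) return false;  // unreadable file counts as a mismatch

    std::istreambuf_iterator<char> ia(a), ib(b), end;
    // Walk both files byte by byte until one of them runs out.
    while (ia != end && ib != end) {
        if (*ia != *ib) return false;
        ++ia;
        ++ib;
    }
    // Identical only if both files ended at the same point.
    return ia == end && ib == end;
}
```

Call it with the original file and the decompressed output; a false result means the round trip is not lossless (or one of the files could not be read).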
Took Shelwien's suggestion and tried the c3334 option. It failed. I discovered
that I was still allocating an array that I had dropped when going to 5n memory,
so I made new executables without the array.
bmixbwts c3334 for file ENWIK9
170,596,577
Checked forward and back; it was OK.
Uploaded the changed files.