View Poll Results: Should I release BCM v2.00 (not comparible with v1.xx)?

Voters
29. You may not vote on this poll
  • Yes

    28 96.55%
  • No

    1 3.45%
Page 1 of 4 123 ... LastLast
Results 1 to 30 of 111

Thread: BCM - The ultimate BWT-based file compressor

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts

    Exclamation BCM - The ultimate BWT-based file compressor

    I proudly present a new version of BCM! It took 3 years, and now it's here, at last!

    Homepage:
    https://github.com/encode84/bcm

    It's 64-bit build only and, due to some reason, with no OpenMP

    For BWT it uses libdivsufsort-lite. New max block size is 2047 MB! CM part was completely rewritten and utilizes some brand new tricks and ideas. Thus, it's not just a stripped-down version like all releases after BCM v0.09!

    So anyway, please enjoy new release!

    LTCB (EDIT: These are outdated results)
    Code:
    >C:\bcm\x64\Release>bcm c enwik8 enwik8.z
    Compressing enwik8: 100000000->20736614 in 14.351s
    
    >C:\bcm\x64\Release>bcm d enwik8.z e8
    Decompressing enwik8.z: 20736614->100000000 in 13.231s
    
    >C:\bcm\x64\Release>bcm c1000 enwik9 enwik9.z
    Compressing enwik9: 1000000000->163885873 in 162.763s
    
    >C:\bcm\x64\Release>bcm d enwik9.z e9
    Decompressing enwik9.z: 163885873->1000000000 in 153.607s
    System specs:
    CPU: Intel Core i7-3770K @ 4.8 GHz
    Memory: Corsair Vengeance LP 16 GB @ 1800 MHz CL9
    SSD: Corsair Force GS 240 GB
    OS: Microsoft Windows 7 SP1

  2. The Following 11 Users Say Thank You to encode For This Useful Post:

    137ben (28th May 2016),Cyan (9th May 2016),Euph0ria (14th May 2016),Fallon (3rd July 2013),Matt Mahoney (23rd June 2013),Nania Francesco (25th June 2013),PerlAmutor (23rd June 2013),samsat1024 (29th June 2013),Stephan Busch (23rd June 2013),svpv (19th November 2018),VadimV (23rd June 2013)

  3. #2
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Hi Ilya

    Very good job !

    The partial results are really interesting ...
    Last edited by Nania Francesco; 23rd June 2013 at 00:29.

  4. The Following User Says Thank You to Nania Francesco For This Useful Post:

    encode (23rd June 2013)

  5. #3
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Finally beat bsc and bbb
    http://mattmahoney.net/dc/text.html#1639

  6. #4
    Member
    Join Date
    Sep 2011
    Location
    uk
    Posts
    237
    Thanks
    187
    Thanked 16 Times in 11 Posts
    Ilya, thanks for new version - however I'm disappointed & can't understand why there isn't a 32 bit o/s version (and I guess I'm not alone) - there are still a lot of eg xps out running on quite fast multicore processors there and it'd be interesting to compare with previous bcms on them eg on relatively small files eg in the 100s of Mbytes or few gbytes rather than 10s or 100s of Gbytes.

    Of course most of the 32 bit o/s run on processors with 64 bit instructions so that isn't an issue. Is is that it won't work on 32 bit and you have only tested on 64 bit o/s? If so you can say 'at your own risk or no support on 32 bit o/ss'.

    John

  7. #5
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    unfortunatley, it's impossible to run 64-bit programs on 32-bit OS

  8. #6
    Member
    Join Date
    Sep 2011
    Location
    uk
    Posts
    237
    Thanks
    187
    Thanked 16 Times in 11 Posts
    Bulat - everyone knows that! You didn't read my email.

    Bearing in mind that previous versions run on 32 bit, I asked for a 32 bit version of the program assuming that it's a recompilation with only few or minor modifications!

    J

  9. #7
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    OK, here it is:
    http://encode.narod.ru/bcm014-WIN32.zip

    This compile is not for benchmarks(!), just for personal evaluation - for those who have no access to a PC under 64-bit Windows! Enjoy anyway!

  10. #8
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Hi Ilya,
    can you anticipate your plans or the roadmap for BCM?

  11. #9
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    Quote Originally Posted by encode View Post
    Well, I've just finished working on a new release. Will test it for a few days to check its performance and double-check each line of a code - since I do not plan to release a next version anytime soon.
    i.e. no roadmap for BCM. Currently I'm working on CRUSH, the next thing is my image compressor BIM!

  12. #10
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @encode: congratulation to your new improved version 1.4 - wonderful

    @Matt Mahoney

    you wrote on your website -- http://mattmahoney.net/dc/text.html#1640
    ---
    bsc 3.1.0, July 9, 2012, is a test version, not compatible with v3.0.0. Source was not released.
    ---

    YES - VERY BAD - v3.1.0 is not compatible with v3.0.0

    BUT - "Source was not released" - that seems not to be correct


    Ilya Grebnov has changed the License from GPL to Apache Version 2.0


    there is a new website:

    http://libbsc.com/

    and

    https://github.com/IlyaGrebnov/libbsc

    and the missing sourcecode seems to be here:

    https://s3.amazonaws.com/libbsc/bsc-3.1.0-src.zip

    can you please proove this and correct it , if possible

    PS: the source-code from v3.0.0 is GPL ...

    best regards

  13. #11
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    Cool, I think the enwik9 results is pretty good, I will test it more

  14. #12
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Thanks. I obviously missed the followup to the original post. http://encode.su/threads/586-bsc-new...ll=1#post29904

  15. #13
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    I have an idea to release the BCM as an open source (public domain) file compressor. It might be sort of bzip2 replacement or the part of PeaZip, as example.

    The main difference from experimental releases of BCM:
    • It has simplified CM back-end - to be faster, to have a smaller code and do not reveal some CM tricks of my own (like FSM stuff and such)
    • It uses libdivsufsort-lite for BWT stage
    • Designed to be okay on 32-bit machines - 32-bit build is the main build. Of course, 64-bit code will be even faster

    No source at this stage, please test an executable and tell me what you think!


  16. The Following 3 Users Say Thank You to encode For This Useful Post:

    ivan2k2 (24th February 2015),Nania Francesco (23rd February 2015),Stephan Busch (23rd February 2015)

  17. #14
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    460
    Thanked 175 Times in 85 Posts
    BCM 0.15 cannot be used with 1000MB blocksize (like BCM 0.14 c1000)
    because I get a 'out of memory' message. This increases filesizes.
    Is there a way go get around this? - Maybe it is because the above is a 32-bit build?

  18. #15
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    It is right, 32-bit build is limited to about 406 MB blocks. (Memory usage is 5N - i.e. for 1000 MB blocks you will need about 5000 MB of free memory - that is far beyond 2 or 3 GB limit of 32-bit executables)
    Final release will have both 32 and 64-bit (as well as Linux) builds
    For now it is more important to test the BCM as a regular compressor - without such extreme options!
    To see will 32-bit code meet or not the expected performance...

  19. #16
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    873
    Thanks
    460
    Thanked 175 Times in 85 Posts
    I see

  20. #17
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @encode: congratulation to your new improved version 1.5

    i am glad to see that the project is alive.

    "release the BCM as an open source (public domain) file compressor"

    - wonderful


    1. it compress better then version 1.4 (in my test 3% better)
    2. it compress faster then version 1.4 (in my test 10% faster)
    3. compression/decompression was correct in my tests

    it works well.

    wishes:

    - implementation within a archiv file format like 7z
    - want to see an implementation of a algorithm like ST5 or ST6 as inside bsc 3.0.0 (GPL)

    best regards

  21. The Following User Says Thank You to joerg For This Useful Post:

    encode (24th February 2015)

  22. #18
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts

    int* next=(int*)&buf[b];

    OR

    int* next=(int*)(buf+b);

    What do you think guys?

    Also:

    int n;
    while ((n=fread(buf, 1, b, in))>0)
    {
    const int p=divbwt(buf, buf, (int*)(buf+n), n); // (int*)(buf+b) works fine here
    if (p<1)
    {
    perror("Divbwt failed");
    exit(1);
    }
    // ...

    if n is less than b - this code may cause "Segmentation fault" under Linux. Why is that?

  23. #19
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    BCM v0.15-BETA - a slightly re-tuned version. Most likely, this is the final version. However, this version will be incompatible with an open source BCM - since it has a BETA file signature.



    Win32/Win64/Ubuntu64 binaries are here:

  24. #20
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    452
    Thanks
    141
    Thanked 155 Times in 103 Posts
    Quote Originally Posted by encode View Post

    int* next=(int*)&buf[b];

    OR

    int* next=(int*)(buf+b);

    What do you think guys?
    Personally I've always preferred the former as the explict & reminds me I'm dealing with the address, but really it's a minor thing. Of course I hope it's word aligned.
    I've been caught out by this before on x86 platforms. While it can handle unaligned access, in a type loop the compiler can (and sometimes WILL) aggressively vectorise your code and start to use SIMD instructions, which cannot handle unaligned access. I had to put some pragmas around a function to prevent newer gcc from doing that once.

    Also:

    int n;
    while ((n=fread(buf, 1, b, in))>0)
    {
    const int p=divbwt(buf, buf, (int*)(buf+n), n); // (int*)(buf+b) works fine here
    if (p<1)
    {
    perror("Divbwt failed");
    exit(1);
    }
    // ...

    if n is less than b - this code may cause "Segmentation fault" under Linux. Why is that?
    I haven't looked into what divbwt does, but I assume it is reading from buf for n bytes and writing back to buf+n?

    Maybe it comes down to the unaligned memory accesses I mentioned above. If it is highly optimised SIMD code then the pointer you pass in may need to be something like buf+((n+3)&~3) to round it up to the next word boundary.

  25. The Following User Says Thank You to JamesB For This Useful Post:

    encode (26th February 2015)

  26. #21
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    on haswell, unaligned access is free and afair, evex encoded (avx) instructions allow to use unalighned memory operands

  27. #22
    Member
    Join Date
    Jun 2013
    Location
    Sweden
    Posts
    150
    Thanks
    9
    Thanked 25 Times in 23 Posts
    Quote Originally Posted by encode View Post
    BCM v0.15-BETA
    DVDMOVIE.[CDC4673E].iso 7.293.587.456 ->

    6.508.074.722 in 2266.84s (old bcm64 c2047)
    6.510.031.431 in 1960.38s (new bcm64 -b2047)
    6.617.791.064 in 1439.17s (new bcm64 -b192)
    6.617.791.064 in 1496.18s (new bcm32 -b192)
    6.632.075.210 in 418.832s (bsc310 -e2b192 -rsP)
    6.632.075.210 in 1936.784s (bsc310 -e2b192 -rsPTt)
    6.966.570.213 (7-Zip [64] 9.20 -mx -mmt=3 -m0=lzma2:d30) Elapsed: 0:41:49,45
    6.978.932.293 (7-Zip [64] 9.20 -mx -mmt=5 -m0=lzma2:d30) Elapsed: 0:16:36,39
    6.989.546.437 (7-Zip [64] 9.20 -mx -mmt=8 -m0=lzma2:d192m) Elapsed: 0:07:52,85

  28. The Following User Says Thank You to a902cd23 For This Useful Post:

    encode (26th February 2015)

  29. #23
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by encode View Post

    int n;
    while ((n=fread(buf, 1, b, in))>0)
    {
    const int p=divbwt(buf, buf, (int*)(buf+n), n); // (int*)(buf+b) works fine here
    if (p<1)
    {
    perror("Divbwt failed");
    exit(1);
    }
    // ...

    if n is less than b - this code may cause "Segmentation fault" under Linux. Why is that?
    ASM?
    Larger piece of code?
    If I were to guess, some misoptimisation caused by pointer aliasing, if you're using gcc or clang you may try -fno-strict-aliasing.

  30. #24
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    Everything works just fine under Windows (Visual C++). But with g++ under Linux...
    This loop processes file by blocks. Forgot to mention, everything works fine until the last, smaller block.

    Anyway, since (int*)&buf[b] (or (int*)(buf+b) ) works fine...

  31. #25
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    Might be it's a Linux-related bug - I'm running Ubuntu LiveCD (OS loaded to RAM only). Might be it's due to some weird memory allocation and/or access - but today I was unable to reproduce the bug...

  32. #26
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    452
    Thanks
    141
    Thanked 155 Times in 103 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    on haswell, unaligned access is free and afair, evex encoded (avx) instructions allow to use unalighned memory operands
    It's certainly not true for all Intel architectures with all SSE instructions though, as I did indeed have crashes caused by this. I had code that I optimised like this:

    Code:
                        unsigned char *cp = fp->uncomp_p;
                        i = 0;
    #ifdef ALLOW_UAC
                        int n = b->len & ~3;
                        for (; i < n; i+=4) {
                            //*cp++ = *dat++ + '!';
                            *(uint32_t *)cp = *(uint32_t *)dat + 0x21212121;
                            cp  += 4;
                            dat += 4;
                        }
    #endif
                        for (; i < b->len; i++) {
                            *cp++ = *dat++ + '!';
                        }
    ALLOW_UAC being a #define on architectures that permit unaligned memory accesses. Basically it's adding '!' to a quality string to turn it from raw binary to printable ascii (FASTQ format conversion), and trying to do the copy + '!' 32-bits at a time. It crashed here as cp and/or dat isn't word aligned.

    Code:
       0x426a43 <bam_put_seq+3219>: and    $0x18,%al
       0x426a45 <bam_put_seq+3221>: add    $0x1,%r12d
       0x426a49 <bam_put_seq+3225>: mov    0x18(%rsp),%rax
       0x426a4e <bam_put_seq+3230>: lea    0x0(,%r12,4),%r10d
       0x426a56 <bam_put_seq+3238>: lea    -0x28(%rbx,%r8,1),%r15
       0x426a5b <bam_put_seq+3243>: xor    %r8d,%r8d
    => 0x426a5e <bam_put_seq+3246>: movdqa (%rax,%r8,1),%xmm1
       0x426a64 <bam_put_seq+3252>: add    $0x1,%ebp
       0x426a67 <bam_put_seq+3255>: paddd  %xmm0,%xmm1
       0x426a6b <bam_put_seq+3259>: movups %xmm1,(%r15,%r8,1)
       0x426a70 <bam_put_seq+3264>: add    $0x10,%r8
       0x426a74 <bam_put_seq+3268>: cmp    %r12d,%ebp
       0x426a77 <bam_put_seq+3271>: jb     0x426a5e <bam_put_seq+3246>
    This was generated by gcc 4.9.2 -O3. With -O2 it doesn't do this. I resolved it hackily with a __attribute__((optimize("no-tree-vectorize"))) statement before the function, but it's not particularly ideal! (The alternative is making the loop counter volatile, but that's an even more bizarre workaround.)

    Anyway, try compiling with -O2 to see if the problem goes away.

  33. #27
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    as i said, it's true for evex-encoded instructions that supported starting from sandy bridge. older archictectures require to use uinaligned load as separate instructions in order to access unaligned data. those unaligned loads are free (compared to aligned ones) at least on haswell
    Last edited by Bulat Ziganshin; 27th February 2015 at 15:40.

  34. #28
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    BTW, in your opinion, what is the best approach for BWT's EOF encoding?

    Currently I use:

    // Inverse BWT
    int t[257]={0};
    for (int i=0; i<n; ++i)
    ++t[buf[i]+1];
    for (int i=1; i<256; ++i) // Skip t[256]
    t[i]+=t[i-1];

    int* next=(int*)&buf[b];
    for (int i=0; i<n; ++i)
    next[t[buf[i]]++]=i+(i>=p); // Two loops with =i and =i+1 make no speed improvement (Visual C++ compiler)

    for (int i=p; i!=0;) // i=0 is the last byte in block
    {
    i=next[i-1];
    putc(buf[i-(i>=p)], out);
    }


  35. #29
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    putc should be slow due to locking. it's better to write data in ~1mb blocks

  36. #30
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,964
    Thanks
    367
    Thanked 341 Times in 134 Posts
    putc() should be a macro, and with _CRT_DISABLE_PERFCRIT_LOCKS Visual C++ must use *_no_lock() functions.
    Anyway, I/O optimizations can be added later... For now, I'm looking for the best BWT output format!

Page 1 of 4 123 ... LastLast

Similar Threads

  1. BCM v0.09 - The ultimate BWT-based file compressor!
    By encode in forum Data Compression
    Replies: 22
    Last Post: 6th March 2016, 09:26
  2. BCM 0.11 - A high performance BWT compressor
    By encode in forum Data Compression
    Replies: 44
    Last Post: 29th October 2010, 22:45
  3. BCM v0.08 - The ultimate BWT-based file compressor!
    By encode in forum Data Compression
    Replies: 78
    Last Post: 12th August 2009, 10:14
  4. BCM v0.01 - New BWT+CM-based compressor
    By encode in forum Data Compression
    Replies: 81
    Last Post: 9th February 2009, 15:47
  5. Blizzard - Fast BWT file compressor!!!
    By LovePimple in forum Data Compression
    Replies: 40
    Last Post: 6th July 2008, 14:48

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •