
Thread: bsc, new block sorting compressor

  1. #31 Bulat Ziganshin (Programmer)
    Probably this Xeon is a hyper-threaded single core.

  2. #32 Member (Germany)
    @Bulat

    Thank you for the hint; it seems you are right.

    1 processor = 1 core with 2 threads
    [Attached image: cpu.jpg]

  3. #33 Bulat Ziganshin (Programmer)
    I made a few tests using natural-language text (english.100MB) and the Linux kernel sources.

    1. sorting:
    • BWT and ST4 sorting have the same speed as in grzip
    • bsc supports huge dictionaries with almost no speed degradation, meaning we can improve compression by 5-10% for free
    • the new ST5 mode is amazing! 2x faster than BWT with only 2% less compression
    2. encoding:
    • with BWT, qlfc provides 1.5% better compression at 10% lower speed compared to grzip -m1
    • st4+mtf+huf is still the speed king
    3. filters:
    • -H24 or so slightly increases the compression level
    • -M32 makes compression faster and, for ST, even better
    • other filters just waste time on my tests, so I've disabled them

    On English text, lzp+bwt+qlfc with the maximum dictionary provides the same compression level as dict+lzp+ppmd while compressing and extracting 30% faster (single-threaded). On Linux sources, the compression ratio is 4% worse while the speed is still 30% better. One problem, though, is that ppmd effectively splits all data into 7-8 MB blocks, so it can easily be multithreaded, at least for extraction; bsc with the maximum block size is single-threaded.

    So I recommend the following settings:
    fast: grzip -m4
    optimum: -m2 -b25 -M32 [-r -s]
    best: -m3 -b1024 -H28 -M32

  4. #34 Bulat Ziganshin (Programmer)
    Suggestions:
    1. give us back the MTF encoder!
    2. is it possible to multi-thread single-block compression? in particular: split the sorting work into several independent buckets, overlap sorting and encoding, make the encoding block 1/4 of the sorted block, or encode in 1 MB sub-blocks (see the sketch at the end of this post). btw, it seems that bsc -t -b1024 demonstrates some internal multithreading while -T works single-threaded
    3. increase the default block size to 25 MB or so
    4. is it possible to make the other filters faster, at least in the detection stage? otherwise, -r -s makes text compression much faster without losing ratio
    5. add an exe filter, preferably dispack with an obj/exe/data detector
    6. bssc provides great speed/compression for executables by using a 22-bit ST transform. you may try...
    7. there is an idea that DICT can provide a better boost for BWT compressors that operate on 16-bit words.

    Non-library code:
    1. allow the use of stdin/stdout/nul; count bytes read/written in the I/O procedures instead of using ftell
    2. mimic the bzip2 syntax, port the program to Unix, and advertise it as a possible bzip2 successor
    3. ST stands for Schindler transform
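
    A minimal sketch of the 1 MB sub-block encoding idea from suggestion 2 above. OpenMP is assumed, and EncodeChunk is a hypothetical stand-in for whatever second-stage coder (qlfc, mtf+huf, ...) is used; none of the names are from the bsc sources:
    Code:
    #include <cstdint>
    #include <cstddef>
    #include <vector>

    // Placeholder for the real second-stage coder; it just copies here so
    // the sketch is self-contained.
    std::vector<uint8_t> EncodeChunk(const uint8_t* data, size_t len)
    {
        return std::vector<uint8_t>(data, data + len);
    }

    // Encode the sorted (BWT/ST) output in independent fixed-size chunks.
    // Each chunk is coded by its own thread; the caller concatenates the
    // pieces. The price is a slightly worse ratio, because every chunk
    // restarts its model.
    std::vector<std::vector<uint8_t>>
    encode_subblocks(const uint8_t* sorted, size_t n, size_t chunk = 1 << 20)
    {
        size_t nChunks = (n + chunk - 1) / chunk;
        std::vector<std::vector<uint8_t>> parts(nChunks);

        #pragma omp parallel for schedule(dynamic)
        for (long long i = 0; i < (long long)nChunks; ++i) {
            size_t off = (size_t)i * chunk;
            size_t len = (off + chunk <= n) ? chunk : n - off;
            parts[i] = EncodeChunk(sorted + off, len);
        }
        return parts;
    }
    At 1 MB granularity the model-restart penalty on BWT output is small, which is presumably why the suggestion picks that size.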

  5. #35 Member (Czech rep.)

    BSC vs nanozip

    I tested BSC on a mix of TXT and Word DOC files... dual-core CPU...

    BSC (block = 64 MB) is 8.5x(!) faster than NanoZip (memory = 400 MB, nz_cm mode), and the penalty is only +7% in size.

    Unbelievable...

  6. #36 Gribok (Programmer)
    Quote (Bulat Ziganshin):
    1. give us back the MTF encoder!
    I am working on this. I will probably add a fast mode in the next release.

    Quote (Bulat Ziganshin):
    2. is it possible to multi-thread single-block compression? in particular: split the sorting work into several independent buckets, overlap sorting and encoding, make the encoding block 1/4 of the sorted block, or encode in 1 MB sub-blocks. btw, it seems that bsc -t -b1024 demonstrates some internal multithreading while -T works single-threaded
    BWT - already have some optimizations.
    ST4/ST5/LZP - not possible with the current implementation.
    QLFC - splitting the ST/BWT output into multiple blocks and processing them in parallel is a very good idea. I will probably add this feature in the next release.

    Quote (Bulat Ziganshin):
    3. increase the default block size to 25 MB or so
    Sounds good. I will add this in the next release.

    Quote (Bulat Ziganshin):
    4. is it possible to make the other filters faster, at least in the detection stage? otherwise, -r -s makes text compression much faster without losing ratio
    Unfortunately, no. You can have a look in the source code: segmentation (the slowest part) is based on work by Dmitry Shkarin, and I have already made it work as fast as possible.

    Quote (Bulat Ziganshin):
    5. add an exe filter, preferably dispack with an obj/exe/data detector
    I don't know. Maybe it is better to add bsc to FreeArc instead of adding filters to bsc itself?

    Quote (Bulat Ziganshin):
    6. bssc provides great speed/compression for executables by using a 22-bit ST transform. you may try...
    Sounds good. I will add ST3 in the next release.

    Quote (Bulat Ziganshin):
    7. there is an idea that DICT can provide a better boost for BWT compressors that operate on 16-bit words.
    Same as 5.

    Quote (Bulat Ziganshin):
    Non-library code:
    1. allow the use of stdin/stdout/nul; count bytes read/written in the I/O procedures instead of using ftell
    2. mimic the bzip2 syntax, port the program to Unix, and advertise it as a possible bzip2 successor
    My goal is a data compression library, so I hope that someone can help me build a nice compressor around it.

    Quote (Bulat Ziganshin):
    3. ST stands for Schindler transform
    I know.

  7. #37 Gribok (Programmer)
    Quote (Bulat Ziganshin):
    • -M32 makes compression faster and, for ST, even better
    In some cases the preceding contexts provide better compression for ST.

  8. #38 mahessel (Member)

    Source?

    Is the source code available somewhere?

  9. #39 Gribok (Programmer)
    Quote (mahessel):
    Is the source code available somewhere?
    The source code is not available yet, but it will be available soon (within two weeks).

  10. #40 Nania Francesco (Tester)

    Only for Help!

    The MOC benchmark failed with the -m1, -m2, and -m3 options!
    The file is:
    http://hea-www.harvard.edu/~fine/ima...ion-Nebula.jpg

  11. #41 Gribok (Programmer)
    Quote (Nania Francesco):
    The MOC benchmark failed with the -m1, -m2, and -m3 options!
    The file is:
    http://hea-www.harvard.edu/~fine/ima...ion-Nebula.jpg
    Thanks for reporting this issue. It will be fixed in the next version. The current workaround is to disable preprocessing with "-p".

  12. #42 Nania Francesco (Tester)

    Hi!

    The MOC benchmark passed with options:
    -m1 -p
    -m2 -p
    -m3 -p
    All is OK!

  13. #43 Gribok (Programmer)
    Block Sorting Compressor version 2.0.0 is available for testing:
    [+]: Bug fixes.
    [+]: Order-3 sort transform.
    [+]: Better multi-core support, even when compressing a file as a single block.
    [+]: Fast "-f" compression mode. Works best with the ST4 and ST5 transforms.
    [+]: Source code is available.

    Source code and binaries for Windows are available on my website: http://libbsc.com/default.aspx

  14. #44 Gribok (Programmer)
    enwik9 tests for the new 2.0.0 version:

    1. enwik9 was compressed to 163888465 bytes in 237.193 seconds and decompressed in 199.031 seconds on an Intel Core 2 Quad Q9400, 8 GB RAM, Win7 x64, using 5095 MB RAM with the "-b1000p" command-line option.
    2. enwik9 was compressed to 201321919 bytes in 25.834 seconds and decompressed in 64.413 seconds on an Intel Core 2 Quad Q9400, 8 GB RAM, Win7 x64, using 1624 MB RAM with the "-b80p -m2f" command-line option.

  15. #45 Black_Fox (Tester)
    Athlon 64 X2 4200+, 2GB RAM, Win7 x86. bsc103 = BSC 1.0.3, bsc200 = BSC 2.0.0.

    Enwik8, default settings:
    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 e enwik8 enw103
    enwik8 compressed 100000000 into 23528564 in 29.046 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 d enw103 enw103_
    enw103 decompressed 23528564 into 100000000 in 15.506 seconds.


    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 e enwik8 enw200
    enwik8 compressed 100000000 into 22251068 in 29.105 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 d enw200 enw200_
    enw200 decompressed 22251068 into 100000000 in 16.230 seconds.


    Enwik8, -b128:
    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 e enwik8 enw103 -b128
    enwik8 compressed 100000000 into 20769950 in 58.665 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 d enw103 enw103_
    enw103 decompressed 20769950 into 100000000 in 30.342 seconds.


    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 e enwik8 enw200 -b128
    enwik8 compressed 100000000 into 20771832 in 42.290 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 d enw200 enw200_
    enw200 decompressed 20771832 into 100000000 in 26.362 seconds.


    avatarized.psd, default settings:
    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 e avatarized.psd ava103
    avatarized.psd compressed 120641788 into 60955976 in 48.686 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 d ava103 ava103_
    ava103 decompressed 60955976 into 120641788 in 32.064 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 e avatarized.psd ava200
    avatarized.psd compressed 120641788 into 49399442 in 47.870 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 d ava200 ava200_
    ava200 decompressed 49399442 into 120641788 in 33.772 seconds.


    avatarized.psd, -b128:
    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 e avatarized.psd ava103 -b128
    avatarized.psd compressed 120641788 into 16361477 in 58.388 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc103 d ava103 ava103_
    ava103 decompressed 16361477 into 120641788 in 33.971 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 e avatarized.psd ava200 -b128
    avatarized.psd compressed 120641788 into 16239295 in 42.077 seconds.

    C:\Users\Black_Fox\Desktop\bsc-x86>bsc200 d ava200 ava200_
    ava200 decompressed 16239295 into 120641788 in 31.125 seconds.

    Some other compressors' performance on avatarized.psd:
    ZIP, Best (via WinRAR): 94272788
    7z, Ultra 64MB: 14560884
    FreeArc -mex5: 16116052

    Note that avatarized.psd is the first large compressible binary file I stumbled upon during testing.

  16. #46 evg (Member)

    Linux builds

    Hello,

    for those interested, attached are builds for Linux x86/x64 and the crude build scripts. They were compiled with very conservative switches; quick tests were OK.

    best regards
    evg

  17. #47 Bulat Ziganshin (Programmer)
    The new -m2 -f mode is truly amazing: it provides about the same compression as grzip -m2 while being 2x faster! At 30 MB/s compression speed on my Core 2 Quad @ 3.25 GHz, it's the best candidate for FreeArc -m3 text compression. BTW, rar -m5 provides almost the same ratio on enwik8 while being 12x slower!

    What about faster and slower methods? Let's see: the ST4/ST5 transformation by itself needs only 1.5-2 seconds (for enwik8 on my CPU). If you can build some weak encoding that uses just another 1.5-2 seconds, the resulting speed on a quad-core CPU will be about 100 MB/s.
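
    One classic candidate for such a weak encoding is a plain move-to-front pass (the mtf stage of the st4+mtf+huf pipeline mentioned earlier). This is a generic textbook sketch, not bsc's actual coder:
    Code:
    #include <cstdint>
    #include <cstring>
    #include <cstddef>

    // Move-to-front: each output byte is the rank of the input byte in a
    // recency list. ST/BWT output is full of runs, so the ranks cluster
    // near zero and a fast static Huffman stage can code them cheaply.
    void mtf_encode(const uint8_t* in, uint8_t* out, size_t n)
    {
        uint8_t table[256];
        for (int i = 0; i < 256; ++i) table[i] = (uint8_t)i;

        for (size_t i = 0; i < n; ++i) {
            uint8_t c = in[i];
            uint8_t r = 0;
            while (table[r] != c) ++r;     // rank of c in the recency list
            out[i] = r;
            memmove(table + 1, table, r);  // shift the first r entries down
            table[0] = c;                  // and move c to the front
        }
    }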

    In the other direction, i.e. good compression at ~10 MB/s on my quad-core CPU, the only way I see is to parallelize the BWT itself. Another alternative is to detect the number of cores and the size of the file and split it accordingly, but that means weaker compression on small files, which is not so interesting for a strong BWT-based method.


    Quote (Bulat Ziganshin):
    7. there is an idea that DICT can provide a better boost for BWT compressors that operate on 16-bit words.
    It requires support for 16-bit elements in the BWT/ST/coders (also LZP and so on). Well, it's a big chunk of work without guaranteed results.


    BTW, you can find some ideas for LZP in the FreeArc LZP sources. I took Shkarin's algorithm and even improved it a bit. In particular, my code doesn't make a random memory access on every decompression step, so decompression with a large hash is 2x faster than in your/Dmitry's code. Also, you may try a multiplicative hash:

    hash(x) = (x*123456791) >> (32-n)

    Probably it has the same speed as your code but a slightly better distribution.
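
    A minimal sketch of that hash, assuming a table of 2^n entries; the constant is from the post, while the function name and the 32-bit context type are illustrative:
    Code:
    #include <cstdint>

    // Multiplicative hash: the multiply mixes the context bits upward and
    // the shift keeps the top n (best-mixed) bits as the table index.
    // Assumes 1 <= n <= 31.
    static inline uint32_t lzp_hash(uint32_t context, unsigned n)
    {
        return (context * 123456791u) >> (32 - n);
    }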

  18. #48 Bulat Ziganshin (Programmer)
    I think the cycle in bsc_unbwt_reconstruct could be multithreaded if several initial indexes are saved with the BWT block. For example, we could save an index after every 1 MB of BWT input data, so this cycle could run independently for every 1 MB unBWT output block (see the sketch at the end of this post). It should allow -m3 decompression to reach 90% utilization on multi-core CPUs.

    The cycle in bsc_unbwt_sort that fills P[] could also be multithreaded, but it would require storing a bucket[] counted independently for every sub-block. These stats could also be used to improve post-BWT compression.

    BTW, how fast is the slow unBWT (the second part of bsc_unbwt_reconstruct)? Is it the same as the -s algorithm in bzip2? It would be useful to have an unBWT with memory requirements as small as bzip2 -s decompression.
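
    A minimal sketch of the multi-index idea from the first paragraph. It assumes the compressor recorded the LF-chain position at every segment boundary during compression (savedIdx[] here, where the entry for the last segment is the usual primary index), a block under 2 GB, and OpenMP; the names and the plain int32 LF array are illustrative, not bsc's actual layout, and the usual primary-index off-by-one details of a real BWT format are glossed over:
    Code:
    #include <cstdint>
    #include <vector>

    // Inverse BWT where every SEG-byte output segment starts from its own
    // saved chain position, so the segments decode independently.
    void unbwt_parallel(const uint8_t* bwt, uint8_t* out, int32_t n,
                        const std::vector<int32_t>& savedIdx, int32_t SEG)
    {
        // LF[i] = start of bwt[i]'s bucket + rank of this occurrence;
        // following LF steps walks backwards through the original text.
        std::vector<int32_t> base(257, 0);
        for (int32_t i = 0; i < n; ++i) base[bwt[i] + 1]++;
        for (int c = 0; c < 256; ++c) base[c + 1] += base[c];

        std::vector<int32_t> LF(n);
        for (int32_t i = 0; i < n; ++i) LF[i] = base[bwt[i]]++;

        int32_t nSeg = (n + SEG - 1) / SEG;
        #pragma omp parallel for
        for (int32_t t = 0; t < nSeg; ++t) {
            int32_t lo = t * SEG;
            int32_t hi = (t + 1 < nSeg) ? (t + 1) * SEG : n;
            int32_t p  = savedIdx[t];       // chain position at segment end
            for (int32_t k = hi - 1; k >= lo; --k) {
                out[k] = bwt[p];            // LF steps emit text backwards
                p = LF[p];
            }
        }
    }
    Storing one extra index per megabyte would cost only 4 bytes each in the block header, and with a single saved index the loop degenerates to the usual serial reconstruction.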

  19. #49 Gribok (Programmer)
    Quote (Bulat Ziganshin):
    BTW, how fast is the slow unBWT (the second part of bsc_unbwt_reconstruct)? Is it the same as the -s algorithm in bzip2? It would be useful to have an unBWT with memory requirements as small as bzip2 -s decompression.
    No, this is not the same as in bzip2. I have two versions of the BWT reconstruction algorithm for different cases, but both have the same memory requirements. The first one is for the case when no symbol has a frequency >= 16M; in this case I can compact a symbol and its frequency into an unsigned int. The second case uses only the frequency and a binary search to reconstruct the symbol. For ST I use three cases: when the length is less than 8M, when no symbol has a frequency >= 8M, and the general case.

    Note: ST can easily be parallelized for two cores, just by processing the data from the beginning and from the end simultaneously. A parallel version for more cores is also possible, but would be trickier.
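
    A minimal sketch of the "compact a symbol and its frequency into an unsigned int" case as I read it; the packing layout and names are assumptions for illustration, not the actual bsc code:
    Code:
    #include <cstdint>

    // If no symbol occurs 2^24 (~16M) times or more, an occurrence rank
    // fits in 24 bits, so one 32-bit word holds the 8-bit symbol plus its
    // rank. One memory access per output byte then yields both the symbol
    // and the next LF position.
    static inline uint32_t pack(uint8_t sym, uint32_t rank)  // rank < 2^24
    {
        return (rank << 8) | sym;
    }

    // One reconstruction step; base[c] is the start of symbol c's bucket.
    static inline uint32_t step(const uint32_t* packed, const uint32_t* base,
                                uint32_t p, uint8_t* outByte)
    {
        uint32_t e = packed[p];
        *outByte = (uint8_t)e;               // low 8 bits: the symbol
        return base[(uint8_t)e] + (e >> 8);  // bucket start + 24-bit rank
    }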

  20. #50 Gribok (Programmer)

    Call for Linux testers

    To those of you who have Linux: I got an email saying that my multi-threading optimization in bsc is not working on Linux. Guys, could you please test bsc on Linux? I would also appreciate it if someone could create a makefile and x86/x64 binaries for Linux, so I could upload them to the web site. Thanks.

  21. #51 Bulat Ziganshin (Programmer)
    Quote (Gribok):
    To those of you who have Linux: I got an email saying that my multi-threading optimization in bsc is not working on Linux.
    Maybe he used a compiler that doesn't support OpenMP?

  22. #52 Gribok (Programmer)
    New version 2.1.0 is available for testing on my website:
    [+]: Slightly faster compression and decompression on multi-core systems when compressing a file as a single block.
    [+]: A makefile for Linux systems. Tested on Linux Mint 9.
    [+]: Removed compilation errors for the GNU C++ compiler.

    Note: this version is not compatible with the previous one.

  23. #53 Sanmayce (Member)

    Must-have console archiver

    Hi,
    from my user point of view, bsc version 2.1.0 is so far among the best compressors (open source; multithreaded; strong compression ratio for text; tunable).
    Yet one thing remains to be added/improved (maybe via a tuning option): better decompression speed.
    While -b2, -pb2, -m0, -m1, -m2, -pm0, -pm1, -pm2 give better compression speed, decompression speed remains virtually the same, which crippled my greedy expectations, he-he.

    591,360 bsc_INTEL.exe (32-bit, from bsc-2.1.0-x86.zip)
    229,376 bsc_Microsoft.exe (32-bit, compiled with /Ox, Visual C++ Toolkit 2003)

    Following is the test (on a Pentium T3400 Merom-1M, 2 cores, 65 nm, 2166 MHz) on all the English OSHO books; this is real-world text, not some synthetic nonsense:

    206,908,949 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt
    101,903,677 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.LZ4
    078,630,023 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.QuickLZ
    062,740,017 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.AIN2.22
    062,620,125 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.ULZ
    062,218,572 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.PKZIP2.50
    055,891,414 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.zipDeflate64
    048,780,809 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.lzt [-29]
    042,723,543 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.lzt [-39]
    041,671,476 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.Plzip0.5
    040,108,833 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.PPMSI1 [-o5]
    035,109,880 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.4x4_4t [4t]
    032,766,449 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.bsc [b25]
    031,050,654 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.zpaq1.00 [c3]
    030,316,502 Folio VIP - Osho's Books on CD-ROM Part 1-2-3-4-5.txt.bsc [b200]

    Compression:
    Code:
    C:\WorkTemp\_Osho's Books_test>bsc_INTEL.exe e OSHO.TXT OSHO.TXT.bsc_INTEL
    This is bsc, Block Sorting Compressor. Version 2.1.0. 17 May 2010.
    Copyright (c) 2009-2010 Ilya Grebnov <Ilya.Grebnov@libbsc.com>.
    
    OSHO.TXT compressed 206908949 into 32766449 in 36.500 seconds.
    
    C:\WorkTemp\_Osho's Books_test>bsc_Microsoft.exe e OSHO.TXT OSHO.TXT.bsc_Microsoft
    This is bsc, Block Sorting Compressor. Version 2.1.0. 17 May 2010.
    Copyright (c) 2009-2010 Ilya Grebnov <Ilya.Grebnov@libbsc.com>.
    
    OSHO.TXT compressed 206908949 into 32766449 in 61.969 seconds.
    Decompression:
    Code:
    C:\WorkTemp\_Osho's Books_test>bsc_INTEL.exe d OSHO.TXT.bsc_INTEL OSHO.TXT
    This is bsc, Block Sorting Compressor. Version 2.1.0. 17 May 2010.
    Copyright (c) 2009-2010 Ilya Grebnov <Ilya.Grebnov@libbsc.com>.
    
    OSHO.TXT.bsc_INTEL decompressed 32766449 into 206908949 in 24.531 seconds.
    
    C:\WorkTemp\_Osho's Books_test>bsc_Microsoft.exe d OSHO.TXT.bsc_Microsoft OSHO.TXT
    This is bsc, Block Sorting Compressor. Version 2.1.0. 17 May 2010.
    Copyright (c) 2009-2010 Ilya Grebnov <Ilya.Grebnov@libbsc.com>.
    
    OSHO.TXT.bsc_Microsoft decompressed 32766449 into 206908949 in 47.188 seconds.
    A nifty console tool with a nifty web site indeed.
    Good luck, regards.

  24. #54 Surfer (Member)

    I have a "noob" question: can MPI give an advantage over OpenMP?
    Or can MPI be used only for cluster systems?


  25. #55 Gribok (Programmer)
    Quote (Surfer):
    I have a "noob" question: can MPI give an advantage over OpenMP?
    Or can MPI be used only for cluster systems?
    Yes, MPI is for cluster systems. And I don't know of any MPI-based compressor.

  26. #56 m^2 (Member)
    Quote (Gribok):
    Yes, MPI is for cluster systems. And I don't know of any MPI-based compressor.
    No, MPI can be used on single machines too.
    But honestly, I don't see the point.

  27. #57 Member (Germany)
    @Gribok
    you wrote: "And I don't know of any MPI-based compressor."
    ---
    here is one:
    ---
    http://compression.ca/mpibzip2/
    ---
    Parallel MPI BZIP2 (MPIBZIP2) by Jeff Gilchrist
    ---
    MPIBZIP2 is a parallel implementation of the
    bzip2 block-sorting file compressor that uses MPI
    and achieves significant speedup on cluster machines.
    The output of this version is fully compatible with bzip2 v1.0.2 or newer
    (ie: anything compressed with mpibzip2 can be decompressed with bzip2).
    MPIBZIP2 should work on any system
    that has a pthreads compatible C++ compiler (such as gcc).
    It has been tested on: Linux and Solaris.
    ---
    source from 2007-07-18 is downloadable:

    http://compression.ca/mpibzip2/mpibzip2-0.6.tar.gz
    ---

    best regards

    and thanks again for your wonderful bsc

  28. #58 Gribok (Programmer)

    bsc version 2.1.5

    New version 2.1.5 is available for testing. This is a service release with some minor changes:
    [+]: Improved multi-core systems support.
    [+]: Improved segmentation algorithm.
    [+]: Improved Linux systems support.

    You can download the new version from my website: http://libbsc.com/default.aspx

  29. #59 Bulat Ziganshin (Programmer)
    Ilya, can you please explain, technically, the changes in the last two versions?

  30. #60 encode (The Founder)
    And can you please explain how you are able to use a 1 GB block with libdivsufsort? The max allocation block is 2 GB = 512M int elements... What kind of trick do you use? Do you use divbwt() for BWT construction, and if so, is divbwt() able to deal with a 1 GB block, or do you use divsufsort() for suffix array construction?
