Thread: Kanzi: Java, Go and C++ compressors

  1. #1
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts

    Kanzi: Java, Go and C++ compressors

    In case anyone is interested in Java and Go implementations of commonly used compression algorithms, take a look at the code https://code.google.com/p/kanzi/ (or on Github here: https://github.com/flanglet/kanzi/).

    For Java, run java -cp kanzi.jar kanzi.app.BlockCompressor -help
    For Go, run go run BlockCompressor.go -help

    The code is open source (Apache License) and should be easy to read and extend. There are no dependencies, so cloning and building the code is trivial whatever the OS (Java 7 required). Let me know if you find any issues.

  2. Thanks (4):

    Bulat Ziganshin (23rd January 2017),Gonzalo (22nd November 2014),Matt Mahoney (22nd November 2014),xezz (22nd November 2014)

  3. #2
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    enwik8 and enwik9 compression results using a modified PAQ entropy coder:

    java -cp kanzi.jar kanzi.app.BlockCompressor -input=enwik9 -block=250000000 -transform=bwt -entropy=paq
    Encoding ...


    Encoding: 441847 ms
    Input size: 1000000000
    Output size: 175104595
    Ratio: 0.17510459
    Throughput (KB/s): 2210


    java -cp kanzi.jar kanzi.app.BlockDecompressor -input=enwik9.knz -output=enwik9.bak -overwrite
    Decoding ...


    Decoding: 373319 ms
    Input size: 175104595
    Output size: 1000000000
    Throughput (KB/s): 2615


    java -cp kanzi.jar kanzi.app.BlockCompressor -input=enwik8 -block=100m -transform=bwt -entropy=paq
    Encoding ...


    Encoding: 43404 ms
    Input size: 100000000
    Output size: 20864051
    Ratio: 0.20864052
    Throughput (KB/s): 2249


    java -cp kanzi.jar kanzi.app.BlockDecompressor -input=enwik8.knz -output=enwik8.bak -overwrite
    Decoding ...


    Decoding: 36909 ms
    Input size: 20864051
    Output size: 100000000
    Throughput (KB/s): 2645






    go run BlockCompressor.go -input=enwik9 -block=250000000 -transform=bwt -entropy=paq
    Encoding ...


    Encoding: 755792 ms
    Input size: 1000000000
    Output size: 175104595
    Ratio: 0.175105
    Throughput (KB/s): 1292


    go run BlockDecompressor.go -input=enwik9.knz -output=enwik9.bak -overwrite
    Decoding ...


    Decoding: 673219 ms
    Input size: 175104595
    Output size: 1000000000
    Throughput (KB/s): 1450


    go run BlockCompressor.go -input=enwik8 -block=100m -transform=bwt -entropy=paq
    Encoding ...


    Encoding: 75203 ms
    Input size: 100000000
    Output size: 20864051
    Ratio: 0.208641
    Throughput (KB/s): 1298




    go run BlockDecompressor.go -input=enwik8.knz -output=enwik8.bak -overwrite
    Decoding ...


    Decoding: 66339 ms
    Input size: 20864051
    Output size: 100000000
    Throughput (KB/s): 1472

  4. #3
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Quick Update:
    - Bug fixes (especially concurrent code)
    - Various speed improvements
    - Block size increase to 1GB (can now process enwik9 in one block)
    - Implementation of Ilya's BCM
    - Better logging

    Here: https://github.com/flanglet/kanzi


    enwik9

    java -Xmx10g -Xms10g -cp kanzi.jar kanzi.app.BlockCompressor -input=e:\temp\enwik9 -output=r:\enwik9.knz -overwrite -block=1000000000 -transform=bwt -entropy=cm -verbose=3


    Block 1: 1000000000 => 1000000004 [186569 ms] => 164237784 [133402 ms] (16%)


    Encoding: 320605 ms
    Input size: 1000000000
    Output size: 164237795
    Ratio: 0.1642378
    Throughput (KB/s): 3045

    java -Xmx10g -Xms10g -cp kanzi.jar kanzi.app.BlockDecompressor -input=r:\enwik9.knz -output=r:\enwik9.bak -overwrite -verbose=3


    Block 1: 164237784 => 1000000004 [127475 ms] => 1000000000 [124938 ms]


    Decoding: 252735 ms
    Input size: 164237795
    Output size: 1000000000
    Throughput (KB/s): 3863

    enwik8

    java -Xmx4g -Xms4g -cp kanzi.jar kanzi.app.BlockCompressor -input=e:\temp\enwik8 -output=r:\enwik8.knz -overwrite -block=100000000 -transform=bwt -entropy=cm -verbose=3

    Block 1: 100000000 => 100000004 [14433 ms] => 20792256 [12400 ms] (20%)


    Encoding: 26913 ms
    Input size: 100000000
    Output size: 20792267
    Ratio: 0.20792268
    Throughput (KB/s): 3628

    java -Xmx4g -Xms4g -cp kanzi.jar kanzi.app.BlockDecompressor -input=r:\enwik8.knz -output=r:\enwik8.bak -overwrite -verbose=3


    Block 1: 20792256 => 100000004 [13243 ms] => 100000000 [8844 ms]


    Decoding: 22133 ms
    Input size: 20792267
    Output size: 100000000
    Throughput (KB/s): 4412
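
The runs above use -transform=bwt. As a rough illustration of what a Burrows-Wheeler transform does, here is a naive sketch that sorts all rotations of a block and keeps the last column plus a primary index. This is not Kanzi's implementation: real BWT coders use suffix-array construction rather than an O(n^2 log n) rotation sort, and the function names below are invented for the example. Reading the 4 extra bytes in the "Block 1" lines as transform bookkeeping such as a primary index is a guess.

```go
package main

import (
	"fmt"
	"sort"
)

// bwt returns the last column of the sorted rotation matrix of s and
// the row (primary index) where the original string ends up.
func bwt(s []byte) ([]byte, int) {
	n := len(s)
	rot := make([]int, n)
	for i := range rot {
		rot[i] = i
	}
	// Sort rotation start offsets by comparing rotations byte by byte.
	sort.Slice(rot, func(a, b int) bool {
		i, j := rot[a], rot[b]
		for k := 0; k < n; k++ {
			x, y := s[(i+k)%n], s[(j+k)%n]
			if x != y {
				return x < y
			}
		}
		return false
	})
	out := make([]byte, n)
	primary := 0
	for r, start := range rot {
		out[r] = s[(start+n-1)%n]
		if start == 0 {
			primary = r
		}
	}
	return out, primary
}

// inverseBWT rebuilds the original block from the last column and the
// primary index using the last-to-first mapping.
func inverseBWT(l []byte, primary int) []byte {
	n := len(l)
	var count, starts [256]int
	for _, c := range l {
		count[c]++
	}
	sum := 0
	for c := 0; c < 256; c++ {
		starts[c] = sum
		sum += count[c]
	}
	lf := make([]int, n)
	for i, c := range l {
		lf[i] = starts[c]
		starts[c]++
	}
	out := make([]byte, n)
	row := primary
	for i := n - 1; i >= 0; i-- {
		out[i] = l[row]
		row = lf[row]
	}
	return out
}

func main() {
	l, p := bwt([]byte("banana"))
	fmt.Printf("%s %d\n", l, p)
	fmt.Printf("%s\n", inverseBWT(l, p))
}
```

The transform itself does not shrink anything; it only groups similar contexts together so that the entropy stage (paq or cm in the runs above) has an easier job.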

  5. #4
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Ported the code to C++ with targets for VS 2008 (win32), VS 2015 (win64) and g++/Linux. Here: https://github.com/flanglet/kanzi

  6. #5
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Making progress with release 1.2.


    https://github.com/flanglet/kanzi

    Code:
    ./kanzi -c -i enwik8 -o enwik8.knz -f -b 100m -l 5 -v 1
    Kanzi 1.2 (C) 2017,  Frederic Langlet
    
    
    Encoding: 100000000 => 19936705 bytes in 32051.5 ms
    
    
    ./kanzi -d -i enwik8.knz -o enwik8.bak -f -v 1
    Kanzi 1.2 (C) 2017,  Frederic Langlet
    
    
    Decoding: 19936705 => 100000000 bytes in 33060.7 ms

  7. #6
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Release 1.3

    https://github.com/flanglet/kanzi/releases


    Code:
    kanzi -c -i /disk1/compression/silesia/. -o /tmp/silesia -f -l 5 -b 100m -j 8
    
    Kanzi 1.3 (C) 2018,  Frederic Langlet
    
    12 files to compress
    
    Encoding /disk1/compression/silesia/reymont: 6627202 => 1007852 bytes in 6002.16 ms
    Encoding /disk1/compression/silesia/ooffice: 6152192 => 1764936 bytes in 6806.85 ms
    Encoding /disk1/compression/silesia/mr: 9970564 => 2192364 bytes in 7321.18 ms
    Encoding /disk1/compression/silesia/dickens: 10192446 => 2160116 bytes in 7697.83 ms
    Encoding /disk1/compression/silesia/xml: 5345280 => 349208 bytes in 2475.83 ms
    Encoding /disk1/compression/silesia/osdb: 10085684 => 2295860 bytes in 13039.2 ms
    Encoding /disk1/compression/silesia/samba: 21606400 => 3225368 bytes in 13191.7 ms
    Encoding /disk1/compression/silesia/sao: 7251944 => 4525916 bytes in 9098.5 ms
    Encoding /disk1/compression/silesia/nci: 33553445 => 1474541 bytes in 15756.2 ms
    Encoding /disk1/compression/silesia/x-ray: 8474240 => 3704188 bytes in 9039.04 ms
    Encoding /disk1/compression/silesia/webster: 41458703 => 6154213 bytes in 19074.4 ms
    Encoding /disk1/compression/silesia/mozilla: 51220480 => 12667469 bytes in 29727.1 ms
    
    Total encoding time: 29729 ms
    Total output size: 41522031 bytes
    Compression ratio: 0.195915
    Code:
    kanzi -c -i /disk1/ws/enwik8 -f -l 5 -b 100m
    
    Kanzi 1.3 (C) 2018,  Frederic Langlet
    
    1 file to compress
    
    Encoding /disk1/ws/enwik8: 100000000 => 19576485 bytes in 32659.1 ms
    
    
    kanzi -d -i /disk1/ws/enwik8.knz -f
    
    Kanzi 1.3 (C) 2018,  Frederic Langlet
    
    1 file to decompress
    
    Decoding /disk1/ws/enwik8.knz: 19576485 => 100000000 bytes in 34409.1 ms

  8. #7
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts

    Release 1.4

    (Obviously the name of this thread is no longer valid since there has been a C++ version for a while).

    Changes:

    - Bug fixes
    - Code reorganization: split into 3 repositories (1 per language): kanzi, kanzi-go, kanzi-cpp.
    - New LZ based compression level 2
    - Compression improved in (ex) levels 1, 3 and 5. The highest level is also faster
    - First stage allows up to 8 functions (instead of 4)
    - The jar can now be built with Maven.
    - Go code is now go gettable

    Tests on i7-7700K @4.20GHz, 32GB RAM, Ubuntu 18.04

    Silesia C++ results: https://github.com/flanglet/kanzi-cpp
    Silesia Java results: https://github.com/flanglet/kanzi
    Silesia Go results: https://github.com/flanglet/kanzi-go

    Some enwik8 results for C++

    Code:
    kanzi -c -i /disk1/ws/enwik8 -f -b 12500k -j 8 -l 2
    Encoding /disk1/ws/enwik8: 100000000 => 26527630 bytes in 1448 ms
    
    kanzi -d -i /disk1/ws/enwik8.knz -f -j 8
    Decoding /disk1/ws/enwik8.knz: 26527630 => 100000000 bytes in 941 ms
    
    kanzi -c -i /disk1/ws/enwik8 -f -b 100m -l 6
    Encoding /disk1/ws/enwik8: 100000000 => 19464686 bytes in 30727 ms
    
    kanzi -d -i /disk1/ws/enwik8.knz -f
    Decoding /disk1/ws/enwik8.knz: 19464686 => 100000000 bytes in 31364 ms
    Last edited by hexagone; 16th May 2018 at 21:41. Reason: add missing word

  9. Thanks:

    Bulat Ziganshin (23rd May 2018)

  10. #8
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Can Kanzi generate gzip or DEFLATE files? If so, it would be interesting to compare the Go version to Klaus Post's Go implementation of gzip and DEFLATE: https://github.com/klauspost/compress/

    He uses some Go assembly, specifically the SSE 4.2 instruction set, and possibly others.

    If Kanzi is its own thing, like its own file or archive format, then it would be useful to compare it to LZTurbo.

  11. #9
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Kanzi has its own header, so it does not generate deflate-compatible files. Frankly, most languages have decent deflate implementations, so there is not much I can provide in this area.
    With regard to benchmarks, it is certainly possible; the code is available.
    Also, the previous release contained mostly bug fixes, but a big focus for the next release is performance.
    We will see how much improvement I can squeeze out.

  12. #10
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Release 1.5


    Changes:

    - Two new levels ( 2 and 8 ) have been introduced to remove gaps in the compression ratio/time curve.
    - Many speed improvements for compression ratios similar to 1.4.
    - Better text compression at level 0.
    - Inverse BWT is now multi-threaded.


    Silesia C++ results: https://github.com/flanglet/kanzi-cpp
    Silesia Java results: https://github.com/flanglet/kanzi
    Silesia Go results: https://github.com/flanglet/kanzi-go



    Code:
      
    Some enwik8 results for C++ (g++ 7.3.0)
    Tests on i7-7700K @4.20GHz, 32GB RAM, Ubuntu 18.04
    
                                           comp(s) decomp(s) size(bytes)
    kanzi -b 25m -l 1 -j 4                    0.56   0.34   36453906                                  
    zip 3.0      -9                           4.70   0.59   36445403
    kanzi -b 12500k -l 2 -j 8                 0.73   0.47   29674729
    bzip2  1.0.6 -9                           5.84   2.52   29008758
    brotli 1.0.5 -9                          64.68   0.84   28879185
    zstd 1.3.3   -19                         30.27   0.19   27659086
    kanzi -b 12500k -l 3 -j 8                 1.33   0.71   26582674
    brotli 1.0.5 -Z                         430.95   0.73   25742001
    kanzi -b 12500k -l 4 -j 8                 1.39   0.80   25046424
    lzma 5.2.2 -9                            54.75   1.00   24861357
    lzturbo 1.2 -49 -b100                    82.19   1.24   24356021
    kanzi -b 25m  -l 4 -j 4                   1.66   1.28   24157983
    kanzi -b 100m -l 5 -j 8                   7.65   3.28   21934022
    bsc -b100                                 5.51   1.33   20920018
    kanzi -b 100m -l 6 -j 8                  14.32   7.81   20791492
    kanzi -b 100m -l 7                       22.48  22.44   19613190
    kanzi -b 100m -l 8                       29.53  29.72   19284434
    xwrt 3.2 -b100 -l14                      51.39  53.37   18721755
    Last edited by hexagone; 25th December 2018 at 22:27. Reason: Added attachment

  13. Thanks (2):

    algorithm (25th December 2018),SolidComp (31st December 2018)

  14. #11
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    868
    Thanks
    242
    Thanked 323 Times in 196 Posts
    run brotli with --large_window 30 for another 5-10 % of compression

  15. #12
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    run brotli with --large_window 30 for another 5-10 % of compression
    Code:
    brotli 1.0.7 --large_window=30          435.10   0.95   24810180

  16. #13
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Are those numbers MB per sec?

    Which version of Java are you using? Is Java 7 a minimum requirement or do you have to use Java 7 specifically?

  17. #14
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Quote Originally Posted by SolidComp View Post
    Are those numbers MB per sec?

    Which version of Java are you using? Is Java 7 a minimum requirement or do you have to use Java 7 specifically?
    Java 7 is a minimum not a required version. I am using openJDK 11.

    The numbers in this forum are encoding time (sec), decoding time (sec), size (bytes) for C++

  18. Thanks:

    SolidComp (31st December 2018)

  19. #15
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Quote Originally Posted by hexagone View Post
    Java 7 is a minimum not a required version. I am using openJDK 11.

    The numbers in this forum are encoding time (sec), decoding time (sec), size (bytes) for C++
    Those are seconds? Dude, these are shocking numbers. Kanzi is a significant advance, right? What am I missing? Why isn't everybody talking about this?

    Is it easy to use like gzip? What's the file format it produces?

  20. #16
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Quote Originally Posted by SolidComp View Post
    Those are seconds? Dude, these are shocking numbers. Kanzi is a significant advance, right? What am I missing? Why isn't everybody talking about this?

    Is it easy to use like gzip? What's the file format it produces?
    Java numbers are here: https://github.com/flanglet/kanzi

    I do not think the numbers are shocking in any way. The test computer is decently fast. There are a few things to say though. First, the Silesia test shows both single- and multi-threaded numbers. Not all other compressors are multi-threaded, so the numbers may or may not be relevant when comparing with XYZ. Second, multi-threaded mode requires more memory (Java is a memory hog). Third, many other compressors are LZ based, which makes decompression really fast but can make compression slow (at higher ratios). See how fast Zstd, LZMA or Brotli can decompress. I chose to change the compression algorithm based on level (i.e. compression ratio): LZ, ROLZ, BWT, CM. As a result, compression/decompression is more symmetric. Also, gzip is outdated at this point, but other compressors like BSC, BCM, Zstd, etc. are still faster/better. One more thing: the text filter helps tremendously on text files (e.g. enwik8). Some other compressors have one, others do not.

    Yes, it should be easy to use. All options are documented (-h for help). The format is proprietary: I must warn you that the bitstream has been changing at each release (which requires a decompression/compression cycle). I will probably freeze it at release 2.0. Also, I think the code works pretty well, but I do not guarantee that there are no bugs. Keep a copy of your data if you want to test kanzi. Finally, since I derived parts of the code from others' work, the code is open source ... feel free to modify it.
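
To put a number on the symmetry point, here is a small sketch that divides compression time by decompression time for a few rows of the Release 1.5 enwik8 table earlier in this thread. The timings are copied from that table. The result type and the asymmetry helper are made up for illustration.

```go
package main

import "fmt"

// result holds one row of the enwik8 table: times in seconds.
type result struct {
	name         string
	comp, decomp float64
}

// asymmetry is compression time divided by decompression time;
// a value near 1 means the codec is roughly symmetric.
func asymmetry(r result) float64 {
	return r.comp / r.decomp
}

func main() {
	results := []result{
		{"zstd 1.3.3 -19", 30.27, 0.19},
		{"lzma 5.2.2 -9", 54.75, 1.00},
		{"brotli 1.0.5 -Z", 430.95, 0.73},
		{"kanzi -b 100m -l 7", 22.48, 22.44},
		{"kanzi -b 100m -l 8", 29.53, 29.72},
	}
	for _, r := range results {
		fmt.Printf("%-20s %8.1fx\n", r.name, asymmetry(r))
	}
}
```

On those figures the LZ-based coders at their top settings compress tens to hundreds of times slower than they decompress, while the two highest kanzi levels come out close to 1.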

  21. #17
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    506
    Thanks
    187
    Thanked 177 Times in 120 Posts
    It still compares pretty well vs bsc. Not quite matching it, but a decent effort, and bsc is quite an efficient bar to be matching. The higher compression modes look like they may be doing well on LTCB vs other tools for speed/size tradeoff.

    Looks good! Thanks.

  22. #18
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Release 1.6


    Changes:

    - Bug fixes & code cleanup
    - Decompression speed improvements, especially level 4 and 5 (new inverse BWT)
    - Better compression ratio at level 1, 2, 5 and 8.
    - New Sorted Ranks Transform
    - Added support for C++17


    Silesia C++ results: https://github.com/flanglet/kanzi-cpp
    Silesia Java results: https://github.com/flanglet/kanzi
    Silesia Go results: https://github.com/flanglet/kanzi-go

    Code:
    Some enwik8 results for C++ (g++ 8.3.0)
    Tests on i7-7700K @4.20GHz, 32GB RAM, Ubuntu 18.04
    
    zip 3.0      -9                           4.70   0.59   36445403
    kanzi -b 25m -l 1 -j 4                    0.50   0.35   34867696
    bzip2  1.0.6 -9                           5.84   2.52   29008758
    brotli 1.0.5 -9                          64.68   0.84   28879185
    kanzi -b 12500k -l 2 -j 8                 0.73   0.44   28607773
    zstd 1.3.3   -19                         30.27   0.19   27659086
    kanzi -b 12500k -l 3 -j 8                 1.23   0.60   26784998
    brotli 1.0.5 -Z                         430.95   0.73   25742001
    kanzi -b 12500k -l 4 -j 8                 1.36   0.74   25045124 
    lzma 5.2.2 -9                            54.75   1.00   24861357
    brotli 1.0.7 --large_window=30          435.10   0.95   24810180
    lzturbo 1.2 -49 -b100                    82.19   1.24   24356021
    kanzi -b 25m  -l 4 -j 8                   1.62   0.88   24157765
    kanzi -b 100m -l 4 -j 8                   5.38   1.72   22511652
    kanzi -b 100m -l 5 -j 8                   7.75   3.14   21301346 
    bsc -b100                                 5.51   1.33   20920018
    kanzi -b 100m -l 6 -j 8                  14.32   7.84   20791496
    kanzi -b 100m -l 7                       19.38  19.53   19597394 
    kanzi -b 100m -l 8                       27.05  27.65   19163098 
    xwrt 3.2 -b100 -l14                      51.39  53.37   18721755






  23. Thanks (5):

    Bulat Ziganshin (7th July 2019),Cyan (10th July 2019),dnd (7th July 2019),Mike (7th July 2019),Stephan Busch (7th July 2019)

  24. #19
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Code:
    Some encoding tests vs. ZCM v.0.93
    Win 7, i7-2600 @ 3.4 GHz, 16 MB 
    
    R:\>AcuTimer.exe zcmx64.exe a -m7 calgary.zcm calgary\*
    Archive is R:\calgary.zcm
    Compressed 3251493 bytes to 810362 bytes
    Elapsed Time:  00  00:00:03.902  (3.902 Seconds)
    
    r:\bin\kanzi.exe -c -i r:\calgary -f -b 50m -l 7
    Total encoding time: 3698 ms
    Total output size: 743760 bytes
    
    
    
    (Peak Memory is 1.1 GB)
    R:\>AcuTimer.exe zcmx64.exe a -m7 enwik8.zcm enwik8
    Archive is R:\enwik8.zcm
    Compressed 100000000 bytes to 19669596 bytes
    Elapsed Time:  00  00:00:38.090  (38.090 Seconds)
    
    (Peak Memory is 1 GB)
    R:\>r:\bin\kanzi.exe -c -i r:\enwik8 -f -b 100m -l 7
    Encoding r:\enwik8: 100000000 => 19597394 bytes in 36516 ms
    
    
    
    (Peak Memory is 1.5+ GB)
    R:\>AcuTimer.exe zcmx64.exe a -m7 silesia.zcm silesia\* 
    Archive is R:\silesia.zcm
    Compressed 211938580 bytes to 41501382 bytes
    Elapsed Time:  00  00:01:09.900  (69.900 Seconds)
    
    (Peak Memory is 850MB)
    R:\>r:\bin\kanzi.exe -c -i r:\silesia -f -b 100m -l 7
    Total encoding time: 93953 ms
    Total output size: 41892099 bytes
    
    (Peak Memory is 1.6+ GB)
    R:\>r:\bin\kanzi.exe -c -i r:\silesia -f -b 100m -l 7 -j 2
    Total encoding time: 56561 ms
    Total output size: 41892099 bytes

  25. #20
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    Release 1.7


    Changes:

    - Bug fixes & code cleanup
    - Slightly better compression throughout
    - Modified level 6 (faster for text files)
    - Better handling of small files


    Silesia C++ results: https://github.com/flanglet/kanzi-cpp
    Silesia Java results: https://github.com/flanglet/kanzi
    Silesia Go results: https://github.com/flanglet/kanzi-go

    Code:
    enwik8
    
    zip 3.0  -9                               4.70   0.59   36445403
    lzfse 1.0                                 4.66   0.82   36157828
    kanzi -b 25m -l 1 -j 4                    0.54   0.39   34532276
    lrzip 0.631 -b -p 12                      3.91   1.29   29122579
    bzip2  1.0.6 -9                           5.84   2.52   29008758
    brotli 1.0.5 -9                          64.68   0.84   28879185
    kanzi -b 25m -l 2 -j 4                    0.72   0.48   27962342
    lrzip 0.631 -p 12                        11.36   0.96   27228013
    orz 1.5.0                                 4.71   0.95   27148974
    zstd 1.4.5   -19                         39.71   0.18   26960372
    kanzi -b 12500k -l 3 -j 8                 1.07   0.64   26741570
    brotli 1.0.5 -Z                         430.95   0.73   25742001
    lzham 0x1010 -m4                         20.35   0.50   25066677
    kanzi -b 12500k -l 4 -j 8                 1.29   0.76   24989286 
    lzma 5.2.2 -9                            54.75   1.00   24861357
    brotli 1.0.7 --large_window=30          435.10   0.95   24810180
    lzturbo 1.2 -49 -b100                    82.19   1.24   24356021
    kanzi -b 25m  -l 4 -j 8                   1.59   0.94   24108751
    kanzi -b 100m -l 4 -j 8                   5.52   1.89   22478636
    lrzip 0.631 -z -p 12                     18.08  15.18   22197072
    kanzi -b 100m -l 5 -j 8                   7.93   3.31   21275446
    bsc -b100                                 5.51   1.33   20920018
    kanzi -b 100m -l 6 -j 8                   9.98   5.78   20869366
    kanzi -b 100m -l 7                       18.98  18.81   19570938
    kanzi -b 100m -l 8                       27.18  27.73   19141858
    xwrt 3.2 -b100 -l14                      51.39  53.37   18721755
    
    
    
    calgary
    
    1.6 Level 2
    Total encoding time: 91 ms
    Total output size: 1077662 bytes
    1.7 Level 2
    Total encoding time: 66 ms
    Total output size: 1012784 bytes
    
    
    1.6 Level 7
    Total encoding time: 1991 ms
    Total output size: 744184 bytes
    1.7 Level 7
    Total encoding time: 808 ms
    Total output size: 739624 bytes
    
    1.6 Level 8 
    Total encoding time: 3849 ms
    Total output size: 735236 bytes
    1.7 Level 8
    Total encoding time: 1382 ms
    Total output size: 733188 bytes
    Last edited by hexagone; 23rd February 2020 at 07:27.

  26. Thanks (2):

    dnd (22nd February 2020),Mike (22nd February 2020)
