
Thread: Etincelle - new compression

  1. #1
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts

    Etincelle - new compression program

    Hi

    I wish to offer to your scrutiny an early look at the first public release of Etincelle, a new fast compression program with some interesting compression ratio / speed properties.

    [Update] : the latest version can be found at this webpage :
    http://phantasie.tonempire.net/pc-co...e-t102.htm#160
    Latest version is RC2 :
    - improved speed and compression rate

    [Older versions]
    RC1 :
    - Default Dictionary 128MB
    - Better compression on binary files
    - Benchmark mode accepts large files

    beta 4 :
    - Long repetitions detection and support

    beta 3 :
    - major speed gains for files containing incompressible segments

    beta 2 :
    - small compression and speed gains
    - minor bugfix, on error message for insufficient memory

    Beta 1 :
    - selectable dictionary size (from 1MB to 3GB)

    Alpha 3 : http://sd-1.archive-host.com/membres...lle-alpha3.zip
    - drag'n'drop interface support
    - benchmark mode support

    Alpha2 : http://sd-1.archive-host.com/membres...lle-alpha2.zip
    - improved global speed
    - bugfix on decoding i/o

    Alpha1 version can be downloaded here : http://sd-1.archive-host.com/membres.../Etincelle.zip

    It gets close to 90MB/s on my system, while providing better compression than zip's best modes. An especially good use case seems to be mail archives, such as Outlook .pst files, which are full of identical attached files, thanks to Etincelle's ability to find matches at large distances (up to 1GB in this version).

    For your comments and evaluation. There are still features & controls I want to add, but the main properties (speed and compression ratio) should be quite close to the final results.

    Edit : Updated graphical comparison of fast compressors :
    http://phantasie.tonempire.net/pc-co...rk-t96.htm#149

    Regards
    Last edited by Cyan; 23rd April 2010 at 15:06.

  2. #2
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Your new compression algorithm seems to be really good in its first version.
    I used a tar file I had tested precomp with, which includes mostly Eclipse and maybe part of a game folder, 512MB in total.

    Etincelle
    Code:
    Compression completed : 512.0MB --> 332.0MB  (64.84%) (348089108 Bytes)
    
    Compression Time : 13.62s ==> 39.4MB/s
    Total Time : 33.39s   ( HDD Read : 17.11s / HDD Write : 2.65s / CPU : 13.62s )
    time: elapsed: 33390ms, kernel: 1250ms, user: 13593ms
    SlugX
    Code:
    524288.00 KB -> 330011.43 KB (62.94%, 337931705 bytes)
    time: elapsed: 32343ms, kernel: 1453ms, user: 18656ms
    It is on a par with SlugX. Don't take the timings too seriously; I ran them with many programs in parallel.

  3. #3
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks for testing, Simon.
    Indeed, SlugX is a tough target to reach, and I'm not trying to beat it on ratio; I'm interested in keeping a speed advantage at this stage.

    Speaking of speed, Etincelle uses a 2MB table for storing pointers. For modern processors, which have plenty of cache, this works well.
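    As a rough illustration (a minimal sketch with assumed names and constants, not the actual Etincelle source), the kind of structure in question looks like this : a hash of the next few input bytes selects a slot in a 2MB table of positions, the stored position is the match candidate, and the slot is then overwritten with the current position. The single table access per input position is what should ideally stay inside L2.
    Code:
    /* Sketch of a 2MB position table : 512K entries of 4-byte positions.
     * All names and constants here are illustrative assumptions. */
    #include <stdint.h>
    #include <string.h>

    #define TABLE_BYTES   (2u << 20)                       /* 2 MB           */
    #define TABLE_ENTRIES (TABLE_BYTES / sizeof(uint32_t)) /* 524288 = 2^19  */

    static uint32_t table[TABLE_ENTRIES];

    static uint32_t hash4(const uint8_t *p)
    {
        uint32_t v;
        memcpy(&v, p, 4);                       /* next 4 input bytes        */
        return (v * 2654435761u) >> (32 - 19);  /* top 19 bits -> 512K slots */
    }

    /* Returns the previous position stored for these 4 bytes (the match
     * candidate) and records the current position in its place. */
    static uint32_t probe_and_update(const uint8_t *base, const uint8_t *ip)
    {
        uint32_t h = hash4(ip);
        uint32_t candidate = table[h];          /* the access that should hit L2 */
        table[h] = (uint32_t)(ip - base);
        return candidate;
    }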

    But I suspect a speed hit for systems with less cache (2MB or even less), and even more so for older processors.
    How big the speed impact could be, I don't know. Maybe it is not that big, maybe it makes a real difference.
    A work-around could be to introduce new modes using less memory (obviously in exchange for a hit on compression ratio).

    Alas, this is something I cannot test alone with my single Core 2. I need your advice and measurements to challenge this hypothesis.

    Simon, would you be so kind as to tell me the size of the L2 cache on your test machine ?

    Best Regards
    Last edited by Cyan; 28th March 2010 at 16:02.

  4. #4
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,505
    Thanks
    741
    Thanked 665 Times in 359 Posts
    Cyan, it adds 50-100 cycles to almost every table access. It's easy to test - just increase the dictionary and table 8x or so.
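    A quick, standalone way to measure this effect (a hypothetical test program, unrelated to Etincelle's code) is to time dependent random accesses into tables of increasing size and watch the per-access cost jump once the table no longer fits in cache.
    Code:
    /* Standalone cache-latency probe : each load feeds the next index, so
     * memory latency is not hidden by out-of-order execution. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        const long accesses = 50 * 1000 * 1000;
        for (size_t mb = 1; mb <= 64; mb *= 2) {
            size_t n = (mb << 20) / sizeof(unsigned);   /* power-of-two entry count */
            unsigned *table = calloc(n, sizeof(unsigned));
            if (!table) return 1;

            unsigned idx = 123456789u;
            clock_t start = clock();
            for (long i = 0; i < accesses; i++)
                idx = idx * 2654435761u + table[idx % n] + 1;   /* dependent load */
            double sec = (double)(clock() - start) / CLOCKS_PER_SEC;

            printf("%3zu MB table : %5.1f ns/access  (check %u)\n",
                   mb, sec * 1e9 / (double)accesses, idx);
            free(table);
        }
        return 0;
    }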

  5. #5
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts

  6. #6
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts

    Hi!

    I wanted to report that my test program signalled an error on these files from the new MOC 2010 :

    http://www.random.org/files/2009/2009-12-27.bin
    http://www.random.org/files/2009/2009-12-26.bin

    Please, can you confirm ?

  7. #7
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks for testing this early version.

    Nania : Yes, you are right, this is a bug in the decompression I/O routine, which occurs when the source file size is an exact multiple of the read buffer (1MB in this version). The compressed file is okay; it is the file reader which handles it badly. This will be corrected in Alpha 2.
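    For illustration (this is not the actual Etincelle reader, just a common way this class of bug arises) : a read loop that detects the end of file with a short read never sees one when the file size is an exact multiple of the buffer, and then misinterprets the final 0-byte read.
    Code:
    /* Illustrative read loop (not Etincelle's code). The fix is to treat a
     * 0-byte read at end-of-file as a clean stop rather than an error. */
    #include <stdio.h>

    #define READ_BUFFER_SIZE (1 << 20)        /* 1 MB, as in this alpha */

    static int process_stream(FILE *in, unsigned char *buf)
    {
        for (;;) {
            size_t n = fread(buf, 1, READ_BUFFER_SIZE, in);
            if (n == 0)
                return feof(in) ? 0 : -1;     /* clean EOF vs real I/O error */

            /* buggy variant : testing `n < READ_BUFFER_SIZE` to detect the
             * last block never fires when the file size is an exact multiple
             * of the buffer, and the extra 0-byte read is then mishandled. */

            /* ... decode the n bytes in buf ... */
        }
    }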

    Regarding the L2 cache effect : 50-100 cycles to reach main memory does indeed seem like a lot. I will try your suggestion, Bulat, by artificially increasing the pointer table size beyond my processor's cache capacity and observing the results.

    Regards

  8. #8
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    You'll find at the link below an updated Alpha 2 version of Etincelle.

    It corrects the bug reported by Nania (thanks, Nania), and provides a bit of I/O speed improvement, thanks to some I/O corrections & simplifications.
    Gains are in the range of 1 sec/GB for "global compression time".
    The compression algorithm has not changed, so CPU gains are limited (in the range of 0.2 sec per GB).

    http://sd-1.archive-host.com/membres...lle-alpha2.zip

    Regards

  9. #9
    Member
    Join Date
    May 2009
    Location
    China
    Posts
    36
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Can't download from the website.
    Could you attach it in this post?

  10. #10
    Member Fu Siyuan's Avatar
    Join Date
    Apr 2009
    Location
    Mountain View, CA, US
    Posts
    176
    Thanks
    10
    Thanked 17 Times in 2 Posts
    Quote Originally Posted by ddfox View Post
    Can't download from the website.
    Could you attach it in this post?
    Here many sites are inaccessible; usually I would try
    http://www.7daili.com/

    It's a web page proxy.

  11. #11
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Hello

    You'll find at the link below an updated Alpha 3 version of Etincelle, featuring the following changes :
    - Drag'n'drop interface : just drop a file onto the program and it will get compressed automatically. The resulting compressed file will be named "filename.eti". It works the same way for decompression.
    - Benchmark mode : completed, now also provides decoding time & speed
    http://sd-1.archive-host.com/membres...lle-alpha3.zip

    Regards
    Last edited by Cyan; 30th March 2010 at 17:17.

  12. #12
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts

  13. #13
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts

    Thumbs up

    Thanks for testing, Matt !

    I will probably submit a better version of Etincelle for your LTCB in a later release. This one is still rough around the edges.

    For the time being, there is now a new beta 1 version of Etincelle, which adds the important capability to select the dictionary buffer size.

    http://sd-1.archive-host.com/membres...elle-beta1.zip

    Etincelle allocates the smaller of the set memory size and the file size.
    The default set memory is 1GB.
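    In code form, the sizing rule above boils down to something like this (a minimal sketch with names of my own choosing, not Etincelle's actual code) :
    Code:
    /* Sketch of the dictionary sizing rule : never allocate more than the
     * file itself needs. Names and the default are illustrative. */
    #include <stdint.h>

    #define DEFAULT_SET_MEMORY (1ULL << 30)     /* 1GB default in beta 1 */

    static uint64_t choose_dict_size(uint64_t set_memory, uint64_t file_size)
    {
        uint64_t size = set_memory ? set_memory : DEFAULT_SET_MEMORY;
        if (file_size < size)
            size = file_size;                   /* smaller of the two values  */
        return size;                            /* the decoder needs the same */
    }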

    Compression-wise, more memory can only bring better compression, but the benefits diminish greatly with increased memory size. On average, it seems a 64MB buffer delivers most of the compression gain (there are, of course, exceptions to this rule).

    On the other hand, the decoder needs as much memory as the encoder, so you may be interested in lower memory settings (down to 1MB).

    You may notice that Etincelle holds up relatively well with decreased memory size, which indicates that most matches are short-distance ones.



    Regards

  14. #14
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts
    Beta 1 results enwik9:

    Output size Comp time Decomp time Option
    324,853,068 bytes 21.138 sec 5.809 sec -m1
    320,701,441 bytes 19.298 sec 7.651 sec -m2
    318,215,057 bytes 19.481 sec 5.414 sec -m4
    316,746,237 bytes 18.648 sec 6.021 sec -m8
    315,896,873 bytes 19.916 sec 5.799 sec -m16
    315,409,228 bytes 19.365 sec 6.183 sec -m32
    315,128,666 bytes 21.888 sec 6.435 sec -m64
    314,964,910 bytes 23.200 sec 5.933 sec -m128
    314,865,683 bytes 25.882 sec 5.978 sec -m256
    314,816,813 bytes 24.797 sec 14.386 sec -m512
    314,801,712 bytes 24.952 sec 24.439 sec -m1024
    314,801,712 bytes 24.948 sec 22.827 sec -m1536
    314,801,712 bytes 24.858 sec 23.754 sec -m1750

  15. #15
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    200
    Thanks
    41
    Thanked 36 Times in 12 Posts
    Congratulations! Matt's timings show that nobody compresses enwik9 faster whilst achieving that ratio.

  16. #16
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks Willvarfar

    And many thanks Sportman for your precise results, as they are very instructive.

    They seem to indicate that there is a sweet spot for compression time around -m8 to -m32, while decompression time suffers a serious slowdown beyond -m256.

    I guess we are talking here about total time, which is dominated by disk I/O on a normal HDD (i.e. not an SSD or a RAM drive).
    As a side comment, you probably have an excellent CPU, judging by your impressive decompression timings.

    I tried to reproduce your results, and got some differences.
    Since I/O timings were highly inaccurate on my system, I decided to follow Bulat's recommendation and set up a dedicated empty partition for benchmarking.
    Then I started to get results with a bit less randomness.

    On compression time, I couldn't get really different timings, whatever the memory size.
    They were all between 15.1s and 15.6s on enwik9. Granted, larger memory sizes tended towards the upper end, but this remained within the error margin.
    Note also that the dedicated partition was key to driving down times, by a pretty large margin (from around 20 sec to 15 sec).

    On the other hand, decompression timings were indeed affected by large memory sizes, as suggested in your post.
    They went from 13 sec (256MB and below) to 18.5 sec (512MB and up), which is definitely noticeable.
    This was unexpected. CPU decompression time remained stable at 7.5 sec, so all the effects are on the I/O side.

    For your comments
    Last edited by Cyan; 1st April 2010 at 03:24.

  17. #17
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Considering the LTCB description of your compressor (ROLZ order 1 + Huffman), I would advise dumping your match position histogram (match distances from the current position, i.e. backward offsets). I'm sure you won't get too many offsets beyond 64 MiB, though your ROLZ buffer size is very important here. I think this could be a good indicator for deciding your default buffer size. I don't think spending more than 64 MiB of memory to provide a few extra matches is a good idea.
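    For reference, the instrumentation suggested here can be as simple as bucketing every match distance into power-of-two bins and dumping the counts after a run (a sketch of my own, not code from either BIT or Etincelle) :
    Code:
    /* Match-distance histogram : bucket i counts offsets in [2^i, 2^(i+1)). */
    #include <stdint.h>
    #include <stdio.h>

    #define DIST_BUCKETS 32
    static uint64_t dist_histogram[DIST_BUCKETS];

    static void record_match_distance(uint64_t distance)  /* call per match */
    {
        int bucket = 0;
        while (bucket < DIST_BUCKETS - 1 && (distance >> (bucket + 1)) != 0)
            bucket++;
        dist_histogram[bucket]++;
    }

    static void dump_histogram(void)
    {
        for (int i = 0; i < DIST_BUCKETS; i++)
            if (dist_histogram[i])
                printf("offsets in [2^%d, 2^%d) : %llu\n",
                       i, i + 1, (unsigned long long)dist_histogram[i]);
    }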
    Last edited by osmanturan; 1st April 2010 at 11:59. Reason: Spelling...
    BIT Archiver homepage: www.osmanturan.com

  18. #18
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,505
    Thanks
    741
    Thanked 665 Times in 359 Posts
    Why don't you use a RAM disk for benchmarks? Optimizing I/O is a completely separate task from optimizing the compressor.

  19. #19
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts
    I have redone the test on a slower but more modern CPU (Nehalem), an Intel Xeon E5530 2.4GHz (2.53GHz turbo mode), with a much quicker file system (a 6-disk RAID5 for writes and an 8-disk RAID5 for reads during compression, and the opposite for decompression).

    Beta 1 results enwik9:

    Output size Comp time Decomp time Option
    324,853,068 bytes 11.700 sec 7.965 sec -m1
    320,701,441 bytes 11.584 sec 7.965 sec -m2
    318,215,057 bytes 11.803 sec 8.191 sec -m4
    316,746,237 bytes 11.970 sec 8.632 sec -m8
    315,896,873 bytes 11.932 sec 8.217 sec -m16
    315,409,228 bytes 11.999 sec 8.592 sec -m32
    315,128,666 bytes 12.665 sec 8.318 sec -m64
    314,964,910 bytes 12.038 sec 8.289 sec -m128
    314,865,683 bytes 12.235 sec 8.399 sec -m256
    314,816,813 bytes 12.158 sec 8.398 sec -m512
    314,801,712 bytes 12.400 sec 8.763 sec -m1024
    314,801,712 bytes 12.307 sec 9.093 sec -m1536
    314,801,712 bytes 12.336 sec 8.650 sec -m1750
    314,801,712 bytes 12.303 sec 8.908 sec -m2048
    No output -m4096
    No output -m8192
    No output -m16384

    Full log:

    324,853,068 -m1

    Compression Time : 11.14s ==> 89.8MB/s
    Total Time : 11.67s ( HDD Read : 0.39s / HDD Write : 0.14s / CPU : 11.14s )
    Elapsed Time: 00 00:00:11.700 (11.700 Seconds)

    Decoding Time : 7.14s ==> 139MB/s
    Total Time : 7.94s ( HDD Read : 0.16s // HDD Write : 0.50s )
    Elapsed Time: 00 00:00:07.965 (7.965 Seconds)

    --

    320,701,441 -m2

    Compression Time : 11.03s ==> 90.7MB/s
    Total Time : 11.56s ( HDD Read : 0.38s / HDD Write : 0.16s / CPU : 11.03s )
    Elapsed Time: 00 00:00:11.584 (11.584 Seconds)

    Decoding Time : 7.14s ==> 139MB/s
    Total Time : 7.94s ( HDD Read : 0.16s // HDD Write : 0.50s )
    Elapsed Time: 00 00:00:07.965 (7.965 Seconds)

    --

    318,215,057 -m4

    Compression Time : 11.31s ==> 88.4MB/s
    Total Time : 11.78s ( HDD Read : 0.36s / HDD Write : 0.09s / CPU : 11.31s )
    Elapsed Time: 00 00:00:11.803 (11.803 Seconds)

    Decoding Time : 7.62s ==> 131MB/s
    Total Time : 8.16s ( HDD Read : 0.08s // HDD Write : 0.33s )
    Elapsed Time: 00 00:00:08.191 (8.191 Seconds)

    --

    316,746,237 -m8

    Compression Time : 11.46s ==> 87.3MB/s
    Total Time : 11.94s ( HDD Read : 0.32s / HDD Write : 0.16s / CPU : 11.46s )
    Elapsed Time: 00 00:00:11.970 (11.970 Seconds)

    Decoding Time : 7.94s ==> 125MB/s
    Total Time : 8.61s ( HDD Read : 0.17s // HDD Write : 0.36s )
    Elapsed Time: 00 00:00:08.632 (8.632 Seconds)

    ---

    315,896,873 -m16

    Compression Time : 11.23s ==> 89.0MB/s
    Total Time : 11.91s ( HDD Read : 0.48s / HDD Write : 0.16s / CPU : 11.23s )
    Elapsed Time: 00 00:00:11.932 (11.932 Seconds)

    Decoding Time : 7.50s ==> 133MB/s
    Total Time : 8.19s ( HDD Read : 0.16s // HDD Write : 0.47s )
    Elapsed Time: 00 00:00:08.217 (8.217 Seconds)

    ---

    315,409,228 -m32

    Compression Time : 11.39s ==> 87.8MB/s
    Total Time : 11.98s ( HDD Read : 0.38s / HDD Write : 0.22s / CPU : 11.39s )
    Elapsed Time: 00 00:00:11.999 (11.999 Seconds)

    Decoding Time : 8.01s ==> 124MB/s
    Total Time : 8.58s ( HDD Read : 0.06s // HDD Write : 0.47s )
    Elapsed Time: 00 00:00:08.592 (8.592 Seconds)

    ---

    315,128,666 -m64

    Compression Time : 11.84s ==> 84.4MB/s
    Total Time : 12.64s ( HDD Read : 0.64s / HDD Write : 0.11s / CPU : 11.84s )
    Elapsed Time: 00 00:00:12.665 (12.665 Seconds)

    Decoding Time : 7.61s ==> 131MB/s
    Total Time : 8.28s ( HDD Read : 0.13s // HDD Write : 0.45s )
    Elapsed Time: 00 00:00:08.318 (8.318 Seconds)

    ---

    314,964,910 -m128

    Compression Time : 11.47s ==> 87.2MB/s
    Total Time : 12.00s ( HDD Read : 0.39s / HDD Write : 0.11s / CPU : 11.47s )
    Elapsed Time: 00 00:00:12.038 (12.038 Seconds)

    Decoding Time : 7.67s ==> 130MB/s
    Total Time : 8.27s ( HDD Read : 0.17s // HDD Write : 0.41s )
    Elapsed Time: 00 00:00:08.289 (8.289 Seconds)

    ---

    314,865,683 -m256

    Compression Time : 11.36s ==> 88.1MB/s
    Total Time : 12.19s ( HDD Read : 0.54s / HDD Write : 0.28s / CPU : 11.36s )
    Elapsed Time: 00 00:00:12.235 (12.235 Seconds)

    Decoding Time : 7.75s ==> 129MB/s
    Total Time : 8.36s ( HDD Read : 0.14s // HDD Write : 0.45s )
    Elapsed Time: 00 00:00:08.399 (8.399 Seconds)

    ---

    314,816,813 -m512

    Compression Time : 11.33s ==> 88.3MB/s
    Total Time : 12.09s ( HDD Read : 0.54s / HDD Write : 0.22s / CPU : 11.33s )
    Elapsed Time: 00 00:00:12.158 (12.158 Seconds)

    Decoding Time : 7.58s ==> 131MB/s
    Total Time : 8.33s ( HDD Read : 0.14s // HDD Write : 0.50s )
    Elapsed Time: 00 00:00:08.398 (8.398 Seconds)

    ---

    314,801,712 -m1024

    Compression Time : 11.39s ==> 87.8MB/s
    Total Time : 12.30s ( HDD Read : 0.56s / HDD Write : 0.16s / CPU : 11.39s )
    Elapsed Time: 00 00:00:12.400 (12.400 Seconds)

    Decoding Time : 7.96s ==> 125MB/s
    Total Time : 8.64s ( HDD Read : 0.13s // HDD Write : 0.41s )
    Elapsed Time: 00 00:00:08.763 (8.763 Seconds)

    ---

    314,801,712 -m1536

    Compression Time : 11.20s ==> 89.3MB/s
    Total Time : 12.22s ( HDD Read : 0.91s / HDD Write : 0.09s / CPU : 11.20s )
    Elapsed Time: 00 00:00:12.307 (12.307 Seconds)

    Decoding Time : 8.41s ==> 118MB/s
    Total Time : 9.00s ( HDD Read : 0.13s // HDD Write : 0.39s )
    Elapsed Time: 00 00:00:09.093 (9.093 Seconds)

    ---

    314,801,712 -m1750

    Compression Time : 11.36s ==> 88.0MB/s
    Total Time : 12.23s ( HDD Read : 0.70s / HDD Write : 0.17s / CPU : 11.36s )
    Elapsed Time: 00 00:00:12.336 (12.336 Seconds)

    Decoding Time : 7.73s ==> 129MB/s
    Total Time : 8.55s ( HDD Read : 0.22s // HDD Write : 0.47s )
    Elapsed Time: 00 00:00:08.650 (8.650 Seconds)

    ---

    314,801,712 -m2048

    Compression Time : 11.26s ==> 88.8MB/s
    Total Time : 12.19s ( HDD Read : 0.66s / HDD Write : 0.25s / CPU : 11.26s )
    Elapsed Time: 00 00:00:12.303 (12.303 Seconds)

    Decoding Time : 8.19s ==> 122MB/s
    Total Time : 8.81s ( HDD Read : 0.11s // HDD Write : 0.37s )
    Elapsed Time: 00 00:00:08.908 (8.908 Seconds)

  20. #20
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Wow, Sportman, such a RAID system; I think I've never seen anything like it before.
    Well, with such bad-ass specs, I/O is no longer a problem.
    And indeed, your results are very regular and consistent.

    So in the end, you show there is an effect of memory size, albeit not a large one :
    Compression --> from 11.7 to 12.3 sec
    Decompression --> from 8.0 to 8.7 sec
    with occasional random noise on top of that.
    That seems pretty consistent.

    You also found an error when requesting too much memory; the proper behavior in this case should be to size down to the file size, which is the smaller value.
    Note also that, in the case where both the requested memory and the file size are too big for the available memory, it should exit cleanly with a clear error message.

    This will be corrected in beta 2. Thanks for pointing that one out.

    Osman : Yes, I fully agree; I will probably reduce the default memory to 64MB for the final release, given the current results.

    Bulat : Sure, I agree, I/O optimization is completely separate from the compression algorithm, and a RAM disk would help reduce the randomness of I/O measurements. Now, there is just one little thing with this methodology : it costs RAM. And I'm not that well supplied yet...
    Last edited by Cyan; 2nd April 2010 at 20:07.

  21. #21
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts
    I have redone the test on the first, older but quicker CPU, a QX9650 4.0GHz, but instead of a single hard disk there are now 3 SSDs in RAID0 for read and write.

    Maybe there was a thermal problem when running so many tests back to back, so this time I put a sleep between every part of the test to let the CPU cool down.

    Beta 1 results enwik9:

    Output size Comp time Decomp time Option
    324,853,068 bytes 7.828 sec 5.821 sec -m1
    320,701,441 bytes 7.724 sec 5.786 sec -m2
    318,215,057 bytes 8.113 sec 5.872 sec -m4
    316,746,237 bytes 8.187 sec 5.917 sec -m8
    315,896,873 bytes 8.354 sec 5.703 sec -m16
    315,409,228 bytes 8.387 sec 5.853 sec -m32
    315,128,666 bytes 8.397 sec 5.814 sec -m64
    314,964,910 bytes 8.366 sec 5.765 sec -m128
    314,865,683 bytes 8.336 sec 6.077 sec -m256
    314,816,813 bytes 8.339 sec 8.760 sec -m512
    314,801,712 bytes 8.657 sec 8.845 sec -m1024
    314,801,712 bytes 8.679 sec 8.636 sec -m1536
    314,801,712 bytes 8.571 sec 8.660 sec -m1750
    314,801,712 bytes 8.671 sec 8.818 sec -m2048

  22. #22
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks for testing, Sportman.
    Indeed, these are the fastest speeds I've ever seen for compressing the full GB of enwik9.

    So now it is very clear : there is indeed an impact on speed with increased memory, albeit not too large a one.

    Now, regarding decoding, there is a much larger gap when moving beyond the 128MB mark. As this gap was not present on your previous platform, I guess there is probably an external reason for it. Maybe a difference in the amount of available memory on each system ?

    Anyway, it points towards a smaller default memory size, such as in the 64 or 128MB range.

  23. #23
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Please find at the link below the beta 2 version of Etincelle.

    It corrects the bug discovered by Sportman regarding selecting too large a memory size (the error message is now clearer, and resizing ensures the problem doesn't happen for small files).
    There is also a very small gain in compression rate and speed.

    http://sd-1.archive-host.com/membres...elle-beta2.zip

    While I was at it, I've also updated my graphical comparison benchmark on enwik8, extending the time range to 3 seconds. There are many more compressors tested now.
    http://phantasie.tonempire.net/pc-co...rk-t96.htm#149

    Regards

  24. #24
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts
    This test system, the E5530 2.4GHz (2.53GHz turbo), has 24GB of memory; the other test system, the QX9650 4.0GHz, has only 2GB.

    Beta 2 results enwik9:

    Output size Comp time Decomp time Option
    325,994,162 bytes 11.705 sec 7.960 sec -m1
    320,919,785 bytes 11.789 sec 8.010 sec -m2
    318,206,223 bytes 11.928 sec 8.183 sec -m4
    316,670,404 bytes 11.938 sec 8.642 sec -m8
    315,799,526 bytes 11.927 sec 8.422 sec -m16
    315,304,742 bytes 12.003 sec 8.859 sec -m32
    315,021,032 bytes 11.996 sec 8.380 sec -m64
    314,856,150 bytes 12.097 sec 8.424 sec -m128
    314,755,904 bytes 12.783 sec 8.449 sec -m256
    314,706,425 bytes 12.193 sec 8.570 sec -m512
    314,691,286 bytes 12.473 sec 8.829 sec -m1024
    314,691,286 bytes 12.361 sec 8.827 sec -m1536
    314,691,286 bytes 12.271 sec 8.858 sec -m1750
    314,691,286 bytes 12.399 sec 9.216 sec -m2048
    314,691,286 bytes 12.258 sec 8.751 sec -m4096
    314,691,286 bytes 12.446 sec 8.717 sec -m8192
    314,691,286 bytes 13.047 sec 8.743 sec -m16384

    Higher memory settings now produce output in beta 2.
    Last edited by Sportman; 7th April 2010 at 05:09.

  25. #25
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks for the test.
    Indeed, on the Nehalem system there is no gap in decoding time with increased memory, just a gradual slowdown, as expected. Strange; I can't find a reason for this difference in behavior. Maybe the OS version ?

    Anyway, the next version, beta 3, will integrate "incompressible section detection" into the main code branch, as discussed in an earlier thread. Given the current preliminary results, expect some quite noticeable speed improvements.

    Regards

  26. #26
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Hi

    Beta 3 is ready for testing, featuring some better than expected results. As it stands, the new segment detection algorithm works wonders.

    It can be downloaded here :
    http://sd-1.archive-host.com/membres...elle-beta3.zip

    Etincelle is able to find small segments of incompressible data within any file and remove them on the fly from the compression loop.
    What's more, this detection comes at no perceptible cost for files which do not contain such segments.
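    One common way such a detector can be built (a sketch under my own assumptions; the actual design is not published here) : track how often match probes succeed over a small window, and when the hit rate collapses, emit the window as a raw block and skip it, re-testing periodically so compression resumes when the data improves.
    Code:
    /* Sketch only : one generic way to detect incompressible segments.
     * All thresholds and names are assumptions, not Etincelle's values. */
    #include <stddef.h>

    #define PROBE_WINDOW     4096   /* bytes examined per decision           */
    #define MIN_HITS_TO_KEEP 8      /* below this, treat the window as random */

    typedef struct { size_t probes; size_t hits; } probe_stats;

    /* The match loop increments `probes` for each lookup and `hits` for
     * each successful match while scanning the current window. */
    static int window_is_incompressible(const probe_stats *s)
    {
        return s->probes >= PROBE_WINDOW / 16   /* enough samples collected */
            && s->hits   <  MIN_HITS_TO_KEEP;   /* almost nothing matched   */
    }

    /* Per window : if window_is_incompressible() fires, copy the window
     * verbatim as a stored block and reset the stats; otherwise compress
     * it normally. Periodic re-testing lets compression resume as soon as
     * the data becomes compressible again. */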

    Let's run some tests to verify this claim :
    Enwik8
    beta2 : 35.76% - 97.1 MB/s - 132 MB/s
    beta3 : 35.76% - 98.6 MB/s - 134 MB/s

    That was my first worry : would the detector slow down files for which it is useless, such as enwik8, where no segment qualifies as "incompressible" ?
    Apparently not. There is even a very small speed boost, due to other minor improvements (removed inefficiencies).

    Let's take the other extreme, and try to compress an already compressed file :
    Enwik8.7z
    beta2 : 100.00% - 51 MB/s - 186 MB/s
    beta3 : 100.00% - 430 MB/s - 1000 MB/s

    Now we're talking, and it provides a useful reference for the speed "wall" when ideal detection conditions are present.


    Making life a little harder, we are going to test a more difficult sample :
    a sound.wav file, which is neither really compressible nor completely incompressible. No filter is applied; this is the direct LZ compression algorithm.
    Sound.Wav
    beta2 : 90.90% - 50 MB/s - 125 MB/s
    beta3 : 90.91% - 150 MB/s - 250 MB/s

    Not bad at all; the minor compression loss is more than offset by the speed gains. Indeed, there was a real risk in this situation that the detector would either trigger too rarely or miss too many matches. Apparently, it succeeds nicely at keeping a good ratio.


    Now let's deal with real-life examples.
    One torture test I had in mind since the beginning was a Win98.vmdk virtual HDD. It is a ~300MB file, a fair part of which (about 20%) consists of CAB files (Microsoft compressed cabinets) somewhat scattered across virtual segments.
    This is a particularly difficult situation : no clear file separation, no extension to help detection, not even a guarantee that files are written in a single contiguous location (they can be scattered across several virtual sectors).

    This is exactly where automatic segment detection can have an impact :

    Win98.vmdk
    beta2 : 64.95% - 67 MB/s - 176 MB/s
    beta3 : 63.86% - 100 MB/s - 250 MB/s

    Now that looks right. The speed improvement is noticeable.
    But wait, that's not all : hasn't the compression ratio improved as well ?
    Yes, it has.
    The reason for this gain is that, with incompressible segments skipped, the table is less cluttered with useless "noise" pointers, which improves the matching opportunities for future data.
    This is a nice bonus of this strategy.

    Skipping data, however, can also hurt compression, by skipping too much or by not providing enough pointers for future data. As a counter-example, let's compress the "Vampire" game directory.

    Vampire
    beta2 : 41.13% - 80 MB/s - 185 MB/s
    beta3 : 41.43% - 90 MB/s - 200 MB/s

    This time it has not worked so well : 0.3% of ratio is lost. Keeping the table clean was not enough to offset the extra data skipped and the reduced opportunity to find new matches.

    Vampire is, however, quite unusual.
    In the vast majority of circumstances, the compression difference is very minor (in the range of 0.05%), and more likely a gain than a loss.
    Speed-wise, however, this is always a gain. Even small incompressible segments lost in a large container provide their share of the speed boost.

    As a last example, let's try an Ubuntu virtual HDD file, as proposed by Bulat for his HFCB benchmark :

    VM.DLL
    beta2 : 32.15% - 90 MB/s - 195 MB/s
    beta3 : 32.13% -100 MB/s - 210 MB/s

    This example is typical of what you'll get in most circumstances with many large files. So it sounds like a good addition to a fast compressor.

    Best Regards
    Last edited by Cyan; 8th April 2010 at 01:47.

  27. #27
    Member Fu Siyuan's Avatar
    Join Date
    Apr 2009
    Location
    Mountain View, CA, US
    Posts
    176
    Thanks
    10
    Thanked 17 Times in 2 Posts
    Hi Cyan, can you provide some information about the final design of "the new segment detection algorithm"? I am very interested in it.

  28. #28
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    That's looking like a very nice and exciting addition, Cyan; can't wait to see what more tricks you have up your sleeve.

  29. #29
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    788
    Thanks
    64
    Thanked 274 Times in 192 Posts
    Beta 3 results enwik9:

    Output size Comp time Decomp time Option
    325,994,237 bytes 11.451 sec 8.531 sec -m1
    320,919,693 bytes 11.418 sec 8.741 sec -m2
    318,205,983 bytes 11.480 sec 8.648 sec -m4
    316,670,206 bytes 11.847 sec 8.751 sec -m8
    315,799,291 bytes 11.770 sec 9.354 sec -m16
    315,304,454 bytes 12.493 sec 8.798 sec -m32
    315,020,644 bytes 11.760 sec 8.899 sec -m64
    314,855,740 bytes 12.563 sec 8.859 sec -m128
    314,755,446 bytes 11.890 sec 8.934 sec -m256
    314,705,987 bytes 11.908 sec 9.430 sec -m512
    314,690,849 bytes 12.108 sec 9.429 sec -m1024
    314,690,849 bytes 12.029 sec 9.486 sec -m1536
    314,690,849 bytes 12.469 sec 9.216 sec -m1750
    314,690,849 bytes 12.696 sec 9.217 sec -m2048
    314,690,849 bytes 12.112 sec 9.589 sec -m4096
    314,690,849 bytes 12.864 sec 9.287 sec -m8192
    314,690,849 bytes 12.111 sec 9.190 sec -m16384

    Test system: E5530 2.4GHz (2.53GHz turbo)

  30. #30
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks Sportman.

    Enwik9 is one of those files which do not benefit from "incompressible segment detection", as no part of enwik9 qualifies as incompressible.

    However, it is a good example for looking at the impact of the modifications.

    It seems that, if your system hasn't changed, there is a slight improvement in compression time (around 0.3s better) and a slight regression in decompression time (around 0.5s worse).
    I'm surprised, because I'm not seeing the same thing (the differences are much smaller on my PC). Then again, our systems are a bit different (my own rig is a Core 2 Duo), and maybe we are not measuring the same thing (I'm comparing CPU time, while you compare total time on a very fast I/O system).
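    To make the distinction explicit (a small illustrative snippet, not part of either test harness) : clock() measures the CPU time consumed by the process, while a wall-clock timer also counts time spent waiting on I/O. On a very fast RAID or SSD setup the two converge; on a plain HDD they can differ a lot.
    Code:
    /* CPU time vs total (wall) time around a compression run. */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        clock_t cpu_start  = clock();
        time_t  wall_start = time(NULL);

        /* ... run the compression or decompression here ... */

        double cpu_sec  = (double)(clock() - cpu_start) / CLOCKS_PER_SEC;
        double wall_sec = difftime(time(NULL), wall_start);  /* 1 s resolution */

        printf("CPU time   : %.2f s\n", cpu_sec);
        printf("Total time : %.2f s (includes I/O waits)\n", wall_sec);
        return 0;
    }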

    Anyway, maybe another area of interest could be binary files, especially those with some difficult parts in them. I expect you will see a larger difference between beta 2 and beta 3.

    can't wait to see what more tricks you have up your sleeve
    There is, Intrinsic, there is

    @Fu : I intend to write a short article on how it works and what benefits come from it. Stay tuned.
