Okay, here is a very special version of the TC file compressor. This one has a huge dictionary (512 MB). I mainly made this version for the Squeeze Chart 2007...
Enjoy!
Link:
tc-5.1dev7x.zip (41 KB)
That is awesome!!! Can't wait to try this one myself!
Thanks Ilia!
Very quick first test:
A10.jpg > 830,453
AcroRd32.exe > 1,305,803
english.dic > 825,510
FlashMX.pdf > 3,694,659
FP.log > 585,245
MSO97.dll > 1,712,306
ohs.doc > 783,943
rafale.bmp > 974,423
vcfiu.hlp > 600,208
world95.txt > 579,138
Total = 11,891,688 Bytes
On small files (<64 MB), performance will be the same. This version has an advantage on large files! Malcolm Taylor describes the ROLZ algorithm as a fast, large-dictionary LZ. Indeed, this technique can cover a large dictionary both quickly and memory-efficiently. For example, a standard LZ technique requires about:
dictionary_size*10
bytes of memory, while ROLZ can cover a really large dictionary with a small memory footprint. For standard LZ, just try to multiply 512 MB by 10 or 12...
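To put rough numbers on that, here is a back-of-envelope sketch (the *10 factor is just the estimate quoted above; real match finders vary):

```cpp
#include <cstdio>

int main() {
    // Rough memory accounting, using the factors quoted in this thread.
    const unsigned long long MB = 1ull << 20;
    const unsigned long long dict = 512 * MB;

    // Classic LZ77 with a hash-chain/binary-tree match finder typically
    // needs on the order of 10 bytes of index structures per dictionary
    // byte, on top of the window itself.
    unsigned long long classic_lz = dict * 10;          // ~5 GB of indexes

    // A ROLZ-style scheme only keeps a small fixed-size table of recent
    // positions per context, so the overhead is (roughly) constant.
    unsigned long long rolz = dict + 16 * MB;           // ~528 MB total

    printf("classic LZ : %llu MB\n", classic_lz / MB);  // 5120 MB
    printf("ROLZ-style : %llu MB\n", rolz / MB);        // 528 MB
    return 0;
}
```

So a 512 MB window with a classic match finder would want roughly 5 GB of RAM, while the ROLZ-style figure fits comfortably on 2007 hardware.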
True. Not even a byte difference from dev7, but there is a small speed drop (about 25 kB/s).
Are you going to compare it with dev7 on some very large (1 GB+) files? I will test it myself over the next few days.
Yep, I am... I don't know yet what files, but I will find some... maybe a text file of 80 million 'a' letters for a start!
One important thing about the Squeeze Chart: it has many large sets with very similar files (DriversXP, FreeDB, etc.).
I'm also testing how far this new TC engine can really look. For example, I make a TAR containing several copies of one large file (>20 MB) and check the result: if the LZ engine is able to find the previous copy inside the TAR, the compression becomes awesome! (A quick way to build such a test file is sketched below.)
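In case anyone wants to reproduce that kind of test, here is a trivial sketch that writes N back-to-back copies of one file (the file names and the copy count on the command line are just example arguments):

```cpp
// Builds a test file that is N back-to-back copies of one input file,
// so a long-range match finder should model copies 2..N almost for free.
// Usage: ./repeat input.bin output.bin 4
#include <cstdio>
#include <cstdlib>
#include <vector>

int main(int argc, char** argv) {
    if (argc != 4) { fprintf(stderr, "usage: %s in out count\n", argv[0]); return 1; }
    FILE* in = fopen(argv[1], "rb");
    if (!in) { perror("fopen in"); return 1; }
    FILE* out = fopen(argv[2], "wb");
    if (!out) { perror("fopen out"); return 1; }
    std::vector<unsigned char> buf(1 << 20);   // 1 MB copy buffer
    int copies = atoi(argv[3]);
    for (int i = 0; i < copies; i++) {
        fseek(in, 0, SEEK_SET);                // rewind for each copy
        size_t n;
        while ((n = fread(buf.data(), 1, buf.size(), in)) > 0)
            fwrite(buf.data(), 1, n, out);
    }
    fclose(in); fclose(out);
    return 0;
}
```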
Why do you not create a compressor with a really big dictionary (512 MB or bigger) using ROLZ, when ROLZ uses only 2*dictionary of memory?
TC 5.1dev7x has a 512 MB dictionary and uses a ROLZ-like algorithm. Also, for ROLZ, a memory usage of 2*dictionary is not mandatory. For example, in my implementations ROLZ uses dictionary+16 MB...
Quick test with ENWIK8...
TC 5.1 dev7 > 27,934,960 bytes
TC 5.1 dev7x > 27,888,899 bytes
Your ROLZ implementation does not really use a 512 MB dictionary; that is just the block size. It is somewhat similar to LZRW: as the "dictionary", only the last N positions are remembered for the currently active contexts.
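As a sketch of that idea (my own illustration of the LZRW-style scheme described above, not TC's actual source): each order-2 context keeps a small ring buffer of the last N positions where it occurred, and the match finder searches only those slots, so the coder can send a tiny slot number instead of a full offset:

```cpp
#include <cstdint>
#include <vector>

// Minimal ROLZ-style match-finder sketch. The "window" is just the
// already-processed buffer; the only index kept is, per order-2 context,
// a ring of the last SLOTS positions where that context was seen.
struct RolzIndex {
    static const int SLOTS = 32;                 // remembered positions/context
    std::vector<uint32_t> pos_;                  // 65536 contexts * SLOTS
    std::vector<uint8_t>  head_;                 // ring head per context

    RolzIndex() : pos_(65536 * SLOTS, 0), head_(65536, 0) {}

    // Order-2 context of position p (requires p >= 2).
    static uint32_t ctx(const uint8_t* buf, uint32_t p) {
        return (uint32_t(buf[p - 2]) << 8) | buf[p - 1];
    }

    // Remember position p under its context.
    void update(const uint8_t* buf, uint32_t p) {
        uint32_t c = ctx(buf, p);
        pos_[c * SLOTS + head_[c]] = p;
        head_[c] = uint8_t((head_[c] + 1) % SLOTS);
    }

    // Search only the SLOTS remembered positions for this context.
    // Returns the best match length; 'slot' is the index the coder
    // would transmit instead of a full offset.
    uint32_t find(const uint8_t* buf, uint32_t p, uint32_t end, int& slot) const {
        uint32_t c = ctx(buf, p), best = 0;
        slot = -1;
        for (int i = 0; i < SLOTS; i++) {
            uint32_t cand = pos_[c * SLOTS + i];
            if (cand == 0 || cand >= p) continue;   // 0 == empty slot
            uint32_t len = 0;
            while (p + len < end && buf[cand + len] == buf[p + len]) len++;
            if (len > best) { best = len; slot = i; }
        }
        return best;
    }
};
```

Note the memory accounting: this index is 65536 contexts * 32 slots * 4 bytes ≈ 8 MB plus the window itself, no matter whether the window is 64 MB or 512 MB, which is how a "dictionary + 16 MB" figure becomes plausible. And since nothing bounds how far back a remembered position may be, the block size, not the table, limits the reach.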
The test: coll.tar (file1 at 12 MB, file2 at 20 MB, then file1 again at 12 MB)
7-Zip (16 MB dictionary) > 43.7 MB
7-Zip (64 MB dictionary) > 31.8 MB
TC 5.1dev7x > 43.5 MB
P.S. ROLZ detects this and compresses it down to 31.2 MB.
I guess I know how my program works...
Actually, how far my ROLZ can really look depends on the data type. With hardly compressible data the distance is the smallest, and with well compressible data it is the largest. Theoretically, the whole dictionary can be covered; in practice, however, the distance is smaller. Approximate values:
16...32 MB with already compressed data
32...64 MB with common data
64...512 MB with highly compressible data
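A back-of-envelope way to see why the reach might vary (my own speculation, using the illustrative 32-slots-per-context structure sketched above, not TC's real parameters): if each context's history is refreshed roughly once per coded token, then on already-compressed data nearly every byte is a literal, the 32 slots recycle quickly, and the oldest remembered position is only a few tens of MB back; on highly redundant data, long matches mean far fewer updates, so the same 32 slots span a much larger window. For example, 32 slots * 1 MB average gap between updates of a context = 32 MB of reach, while 32 slots * 16 MB gap = 512 MB.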