-
The Founder
Okay, a new version of TC is out.
What's new / the new algorithm configuration:
+ For coding literals and match lengths TC now uses order-3-1-0 PPMC [!]
+ Improved hashtable and frequency counting
+ Reduced memory usage. Now TC uses just 20 MB (8 MB PPM hashtable, 4 MB LZP index table, 8 MB buffer)
Enjoy!
Link:
Download TC 5.0dev7 (30 KB)
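The "order-3-1-0" chain means the coder first tries a 3-byte context and escapes down to order-1, then order-0, whenever the symbol is unseen. Below is a minimal sketch of that fallback with PPM method-C escape estimation — a toy illustration only, not TC's implementation: it omits symbol exclusions and the arithmetic coder, and uses floating-point probabilities.

```python
from collections import defaultdict

class ToyPPMC:
    """Toy order-3-1-0 PPM with method-C escapes (illustration only:
    no exclusions, no arithmetic coder, floating-point probabilities)."""
    ORDERS = (3, 1, 0)

    def __init__(self):
        # counts[k][context_bytes][symbol] -> frequency
        self.counts = {k: defaultdict(lambda: defaultdict(int))
                       for k in self.ORDERS}

    @staticmethod
    def _ctx(history, k):
        return history[max(0, len(history) - k):]

    def probability(self, history, symbol):
        """Chained probability of `symbol` (a byte value) after `history`."""
        p = 1.0
        for k in self.ORDERS:
            freqs = self.counts[k][self._ctx(history, k)]
            total = sum(freqs.values())
            if total == 0:
                continue                   # empty context: free escape down
            denom = total + len(freqs)     # method C: escape count = distinct symbols
            if symbol in freqs:
                return p * freqs[symbol] / denom
            p *= len(freqs) / denom        # escape and fall to the next order
        return p / 256.0                   # order -1: uniform over bytes

    def update(self, history, symbol):
        for k in self.ORDERS:
            self.counts[k][self._ctx(history, k)][symbol] += 1

model = ToyPPMC()
data = b"abracadabra"
for i, s in enumerate(data):
    model.update(data[:i], s)
p_hit = model.probability(b"abr", ord("a"))   # 'a' seen twice after "abr" -> 2/3
p_miss = model.probability(b"abr", ord("z"))  # escapes all the way down
```

Without exclusions the chained probabilities sum to less than 1, which is part of why a real PPMC implementation is more involved than this sketch.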
-
-
The Founder
TC 5.0dev7 on Large Text Compression Benchmark:
ENWIK8: 28,111,955 bytes
ENWIK9: 250,077,573 bytes (compression: 285 sec, decompression: 325 sec)
Memory usage: 20 MB
P4 3.0 GHz, 1 GB RAM, Windows XP SP2
That means TC now completely outperforms BZIP2!
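As a quick back-of-the-envelope check, using only the numbers quoted above, the ENWIK9 result works out to roughly a 25% compression ratio at about 3.5 MB/s on that machine:

```python
# Figures quoted in the post above (ENWIK9 = 10^9 bytes of Wikipedia text).
enwik9_in = 10**9
enwik9_out = 250_077_573
comp_time, decomp_time = 285, 325          # seconds on the P4 3.0 GHz

ratio = enwik9_out / enwik9_in             # compressed fraction of the original
comp_speed = enwik9_in / comp_time / 1e6   # MB/s while compressing
decomp_speed = enwik9_in / decomp_time / 1e6

print(f"ratio {ratio:.1%}, compress {comp_speed:.2f} MB/s, "
      f"decompress {decomp_speed:.2f} MB/s")
```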
-
-
The Founder
TC 5.0dev6 on the Calgary Corpus (a tribute to the compression.ru staff):
bib: 28,774 bytes
book1: 249,186 bytes
book2: 168,648 bytes
geo: 61,831 bytes
news: 118,177 bytes
obj1: 10,510 bytes
obj2: 76,104 bytes
paper1: 17,112 bytes
paper2: 26,411 bytes
pic: 52,895 bytes
progc: 12,801 bytes
progl: 15,164 bytes
progp: 10,613 bytes
trans: 16,621 bytes
total: 864,847 bytes (2.2022 bpb)
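The total and the bpb figure can be reproduced from the per-file sizes above, taking the uncompressed 14-file Calgary corpus as 3,141,622 bytes (the standard total for this file set):

```python
# Compressed sizes quoted in the post above (bytes).
calgary = {
    "bib": 28_774, "book1": 249_186, "book2": 168_648, "geo": 61_831,
    "news": 118_177, "obj1": 10_510, "obj2": 76_104, "paper1": 17_112,
    "paper2": 26_411, "pic": 52_895, "progc": 12_801, "progl": 15_164,
    "progp": 10_613, "trans": 16_621,
}
total = sum(calgary.values())
CORPUS_BYTES = 3_141_622          # uncompressed size of the 14-file corpus
bpb = total * 8 / CORPUS_BYTES    # bits per byte over the whole corpus
print(total, f"{bpb:.4f} bpb")
```

The computed value agrees with the posted 2.2022 bpb to within rounding.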
-
-
-
-
The Founder
TC 5.0dev7 on SFC (maximumcompression.com):
A10.jpg: 853,308 bytes
acrord32.exe: 1,706,268 bytes
english.dic: 860,544 bytes
FlashMX.pdf: 3,764,619 bytes
fp.log: 626,490 bytes [!]
mso97.dll: 2,071,456 bytes
ohs.doc: 857,063 bytes
rafale.bmp: 1,035,898 bytes
vcfiu.hlp: 727,511 bytes
world95.txt: 618,301 bytes
total: 13,121,458 bytes
-
-
TC 5.0dev7 shows big improvements in both the SFC and MFC tests.
-
-
The Founder
Note that order-3-1-0 PPMC is a powerful algorithm and can provide serious compression even without any LZP layers. I consider the current LZP scheme suboptimal, and in some cases it can even hurt compression. That's because the current TC uses a scheme implemented for lower orders of context modeling, such as order-1. Such low orders cannot provide serious compression on their own, so there LZP plays the primary role. With PPMC the task of LZP is different: it must efficiently encode long matches without preventing efficient coding of short matches by the PPMC algorithm. I think in the next version
I'll completely rewrite the entire algorithm, including the LZP layer (improving it once again) and the PPM hashtable, since I'll completely change literal/match-length coding. From that I expect improved speed and compression. Also note, the new TC must beat LZPXJ (even with its superlarge memory footprint) and UHARC-LZP. So this version must become something in between UHARC-LZP and UHARC-PPM.
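The division of labor described here — LZP grabbing long matches while PPM codes literals and short matches — can be sketched with a toy LZP parser. All names and constants below are hypothetical: a plain position table stands in for TC's real index table, and there is no entropy coding at all.

```python
TABLE_BITS = 16
CTX_LEN = 4       # bytes of context hashed to predict the next match
MIN_MATCH = 3     # toy threshold for a profitable match

def _ctx_hash(ctx):
    # Multiplicative hash of the 4-byte context (hypothetical constants).
    x = int.from_bytes(ctx, "little") * 2654435761 & 0xFFFFFFFF
    return x >> (32 - TABLE_BITS)

def lzp_parse(data):
    """Split `data` into ('lit', byte) / ('match', length) tokens: the hash of
    the previous CTX_LEN bytes predicts where this context last occurred, and
    a run matching that position becomes a single match token."""
    table = [0] * (1 << TABLE_BITS)   # context hash -> last position (0 = empty)
    tokens, i = [], 0
    while i < len(data):
        if i < CTX_LEN:
            tokens.append(("lit", data[i])); i += 1; continue
        h = _ctx_hash(data[i - CTX_LEN:i])
        p, table[h] = table[h], i
        n = 0
        while p and i + n < len(data) and data[p + n] == data[i + n]:
            n += 1
        if n >= MIN_MATCH:
            tokens.append(("match", n)); i += n
        else:
            tokens.append(("lit", data[i])); i += 1
    return tokens

def lzp_restore(tokens):
    """Inverse of lzp_parse: replays the same table updates to find matches."""
    table = [0] * (1 << TABLE_BITS)
    out = bytearray()
    for kind, val in tokens:
        p = 0
        if len(out) >= CTX_LEN:
            h = _ctx_hash(bytes(out[-CTX_LEN:]))
            p, table[h] = table[h], len(out)
        if kind == "lit":
            out.append(val)
        else:
            for n in range(val):      # overlapping copies: append byte by byte
                out.append(out[p + n])
    return bytes(out)

data = b"to be or not to be, that is the question. " * 8
tokens = lzp_parse(data)
```

Only the literals and match lengths survive as tokens — exactly the symbols the post says the PPMC layer is responsible for coding.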