From the MC guestbook...
Originally Posted by Oscar Garcia
Mirror #1: THOR_094.zip (68 KB)
Mirror #2: Download THOR
2x faster with the same compression level as tornado
Compression level isn't top notch, but it's fast as hell!
Bulat, I think you should request the source of THOR! Since the author appears...
Yepp, THOR was written using Delphi 6.0 - 7.0. What if I rewrite LZPM using Delphi??
Originally Posted by encode
Reminds me of the times when I had even less experience than now and thought that console applications are written in Pascal and GUI ones in C++
encode
the main source of slowness nowadays is the number of cache misses. i know every cache miss in my algorithm, and it makes no difference whether it is written in delphi, asm or ? (nevertheless, i can say that delphi has much worse optimization than any modern c++ compiler)
lzp compressors, even those written 10 years ago, are good performers. probably, by adding new ideas, the LZP method allows a much better speed/compression tradeoff than LZ77. and i flatter myself with the hope that the new thor was improved using tor or quad ideas
i don't see many possibilities to improve tor in fast modes (-1 to -4), so i should say that THOR is definitely the best. my program has its advantages but the main goal - to outperform thor using my huge lz77 experience - was not reached. it would be interesting to try LZP too, but i think that it would require much more time (i'm pretty ignorant in this area) and it is not so important for freearc users - even at half the speed of thor, the tor algorithm will be enough for fast freearc modes
Yes, but it is worth a try! Probably I'll get better I/O performance.
about i/o - there are a few standard ideas that you may use
1) don't use fgetc/fputc. make your own buffer instead:
static inline void put_byte(int c)
{
    *curptr++ = (unsigned char)c;   /* store the byte in our own buffer */
    if (curptr >= bufend) {
        /* buffer is full: write it out and reset curptr */
    }
}
2) if you really need fast i/o - use 3 threads - for read, compress and write. read/write data in large chunks. use circular buffers. preread a large enough amount of data (for example, for quad, which compresses data in 16mb blocks, you should use 4*16mb preread buffers and at least 2*16mb output buffers)
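Idea 1 can be fleshed out roughly as follows. This is only a minimal sketch in C: the OutBuf type, the out_init/out_flush/out_putc names and the 64 KB buffer size are all made up for illustration and are not taken from THOR, tornado or quad. The point is just that the per-byte path is a store and a compare, and the library is called once per full buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define OUT_BUF_SIZE (64 * 1024)   /* assumed size, tune for your I/O */

/* Hypothetical buffered writer; all names here are illustrative. */
typedef struct {
    FILE *fp;                         /* destination stream */
    unsigned char buf[OUT_BUF_SIZE];  /* our own output buffer */
    unsigned char *cur;               /* next free byte in buf */
    unsigned char *end;               /* one past the last byte of buf */
} OutBuf;

static void out_init(OutBuf *o, FILE *fp)
{
    assert(o != NULL && fp != NULL);
    o->fp = fp;
    o->cur = o->buf;
    o->end = o->buf + OUT_BUF_SIZE;
}

/* Write whatever is buffered with a single fwrite; returns 0 on success. */
static int out_flush(OutBuf *o)
{
    size_t n = (size_t)(o->cur - o->buf);
    o->cur = o->buf;
    return (n == 0 || fwrite(o->buf, 1, n, o->fp) == n) ? 0 : -1;
}

/* Hot path: store one byte, flush only when the buffer is full. */
static inline int out_putc(OutBuf *o, int c)
{
    *o->cur++ = (unsigned char)c;
    return (o->cur >= o->end) ? out_flush(o) : 0;
}
```

Call out_init once, out_putc for every output byte, and out_flush once more before closing the file so the last partial buffer is not lost.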
It's a well-known thing...
I must admit that Pascal/Delphi has some sort of magic. It's easy. It has all functions for easy app development. Easy to read, easy to understand...
have you tried using win32's createfile api in asynchronous mode, with the sequential scan flag set and the system's internal file buffering disabled?
i think it should be faster, because asynchronous mode allows you to perform i/o operations and (de)compress files at the same time.
it's the same as using separate threads for i/o. in haskell, my way is simpler (and i tested that it really allows 100% overlapping of i/o and compression), in C your way may be simpler
the only difference is that i don't disable buffering. my way is already implemented in freearc, you can try it using a fast mode (say, -m2) and a large number of files
Originally Posted by encode
yes, it's great for application program development (see juliet-prg.narod.ru), but it doesn't give any automatic speed improvements. our algorithms are the problematic part, not our compilers/languages
Thanks Oscar!
I invite you to join us here at the forum.
Originally Posted by LovePimple
Thanks LovePimple!
Originally Posted by encode
You're right, encode.
You've got quite a nice site here.
Originally Posted by Bulat Ziganshin
Your words honour you, Bulat. But don't give up, we must stress those new Opterons to the limit
Good to have you on board Oscar! Excellent work with the latest version of THOR!
Could you please explain what changes you have made since the previous version?
Oscar, i'm glad to see you here
about tornado - i'm working on the next version, but it has only minor improvements in fast modes. unfortunately, a good program needs a large amount of time, even if the author has good potential. i developed tor mainly for freearc and 7zip, and Igor said that he doesn't want to use it - so it's only for my own archiver
i have already developed and adapted several small algorithms to improve text, wave, executable compression in freearc (www.haskell.org/bz), and tornado became one more algorithm - for fast mode. i can spend 2-4 weeks on it and then prefer to go on to polishing other aspects of the archiver, while for you thor is only (compression) algorithm
if some fast lzh compressor becomes available in source form, i will prefer to use it and not reinvent the wheel. if you make the thor sources available under the lgpl license, i will probably reimplement them in C++ and include them in my program. otherwise, thor will remain the best program in this area, while tor will remain the only one available with sources
I notice that you still haven't fixed the bug that was reported several months ago by Mark Cramer.
Originally Posted by Mark Cramer
My test...
-=[ THOR ]=- v0.94 alpha Oscar Garcia
WARNING! Evaluation version. May contain bugs. Use only for TESTING.
- Compressing... 100%%
- Done.
FILES: 1
SIZE: -1884254208 bytes
COMPRESSED: 1366538540 bytes
RATIO: UNDEFINED
TIME: 122sec 734msec
SPEED: UNDEFINED
Kernel Time = 15.031 = 00:00:15.031 = 12%
User Time = 38.843 = 00:00:38.843 = 31%
Process Time = 53.875 = 00:00:53.875 = 43%
Global Time = 122.828 = 00:02:02.828 = 100%
The size of the test file was only 2.24 GB (2,410,713,088 bytes).
Any chance you can fix this bug for the next version?
LovePimple
read:
Originally Posted by LovePimple
Does contain a bug!
It's not a bug. It's an integer limitation...
Originally Posted by encode
Well, delphi uses type LongWord, which is an unsigned integer, so the limit could be 4GB then...
And then there is Int64...
Generic integer types for 32-bit implementations of Delphi
Type      Range                     Format
Integer   -2147483648..2147483647   signed 32-bit
Cardinal  0..4294967295             unsigned 32-bit
Fundamental integer types include Shortint, Smallint, Longint, Int64, Byte, Word, and Longword.
Fundamental integer types
Type      Range                     Format
Shortint  -128..127                 signed 8-bit
Smallint  -32768..32767             signed 16-bit
Longint   -2147483648..2147483647   signed 32-bit
Int64     -2^63..2^63-1             signed 64-bit
Byte      0..255                    unsigned 8-bit
Word      0..65535                  unsigned 16-bit
Longword  0..4294967295             unsigned 32-bit
Most likely he uses Integer.
Or this is just a writeln bug!
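The negative SIZE in the log is what you would get if the 2,410,713,088-byte file size were kept in a signed 32-bit variable: 2,410,713,088 - 2^32 = -1,884,254,208, which matches the reported value. A minimal C sketch of the arithmetic follows. The two helper names are hypothetical and this is not THOR's code, it only mimics how Delphi's signed Integer/Longint and unsigned Cardinal/Longword would hold the same number.

```c
#include <stdint.h>

/* Value seen when a file size is stored in a signed 32-bit variable
   (like Delphi's Integer/Longint). Hypothetical helper for illustration. */
static int32_t size_as_signed32(uint64_t size)
{
    uint32_t low = (uint32_t)size;            /* keep only the low 32 bits */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;                  /* fits, value unchanged */
    /* two's complement: anything above 2^31-1 reads as low - 2^32 */
    return (int32_t)((int64_t)low - 4294967296LL);
}

/* Same size in an unsigned 32-bit variable (like Cardinal/Longword):
   correct below 4 GB, wraps modulo 2^32 at or above it. */
static uint32_t size_as_unsigned32(uint64_t size)
{
    return (uint32_t)size;
}
```

For the 2,410,713,088-byte test file, size_as_signed32 gives -1884254208 while size_as_unsigned32 still gives 2410713088, so an unsigned 32-bit type only moves the limit to 4 GB. A 64-bit type such as Int64 is needed to go beyond that.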
Yup, I use cardinal and today I discovered longword... learn something every day
thor benchmarked (also tornado, quicklz, lzpm, m99).
http://cs.fit.edu/~mmahoney/compression/text.html
Originally Posted by Matt Mahoney
actually, quicklz-0 should be the fastest compressor in your test. tornado has -1 to -12 predefined options, and thor 0.94 works better than the previous version, so it makes sense to retest ef/e/ex too
Hi all,
Originally Posted by LovePimple
This one should handle huge files (>2GB):
http://rapidshare.com/files/27449609/THOR_094.zip.html
Originally Posted by encode
It was longint
Originally Posted by Matt Mahoney
Thanks, Matt
Originally Posted by Oscar
Thank you!
Originally Posted by Oscar
Thanks Oscar!
Works like a charm!
Mirror: Download THOR
By the way, I have my own fast LZ! It compresses enwik8 within 2 sec! (looks like it is I/O bound). If anyone is interested I can release it.
It uses:
+ My own LZP
+ Byte aligned I/O
+ Has Optimal Parsing