1. Thanks Ilia!

2. 10x! my mulptiply hash is rather close to FNV hash mentioned but not tested there: http://isthe.com/chongo/tech/comp/fnv/

3. Very nice article ! Thanks !

4. Thanks Ilia!

5. Probably, in new LZPM Ill replace current multiplicative hashing with modular hashing.

MOD hashing is very simple:

key % PRIME

In my case, key = (*(reinterpret_cast<__int64 *>(&buf[i])) & 0xffffffffff)

Originally Posted by lzpm.cpp
#define HASHSIZE 4194301
#define gethash(i) ((*(reinterpret_cast<__int64 *>(&buf[i]))
& 0xffffffffff) % HASHSIZE)
I think such hash function should be useful for 5-byte strings.

PRIME values:
for 20-bit hash PRIME = 1048573
for 21-bit hash PRIME = 2097143
for 22-bit hash PRIME = 4194301
for 23-bit hash PRIME = 8388593
for 24-bit hash PRIME = 16777213

The complete table of values can be found here:
http://gcc.gnu.org/ml/gcc-patches/2001-03/msg01075 .html

6. A 64-bit PRIME that can be used with multiplicative hashing:
11400714819323198483
or
0x9e3779b97f4a7c13 (hex)

Modular hashing failed. LZPM works 2-times slower (on text files) with such technique.

7. Modular hashing is slow because division is slow. g++ will optimize a/b to the form a*((1<<n)/b)>>n when b is constant. Then it will optimize a%b to a-((a/b)*b). Still this is a lot of instructions.

A simpler hash for a 32 bit input is to multiply by some large odd number (does not have to be prime) and use the high bits. Each bit depends on the bits to the right, but not the bits to the left. A better hash is:

x *= C1;
x = x<<16|x>>16;
x *= C2;

where C1 and C2 are odd. This also has the nice property that there are no collisions because each step has an inverse. Also, g++ will optimize the second statement to a single rotate instruction.

8. imo almost every modern compiler uses multiplying by reciprocal when dealing with diviosion by constant.

so entire code sequence for modulo would look like that:

mov ebx,[number]
mov eax,reciprocal
mul ebx ; after that a / b is in edx
mov eax,edx
mul ebx
sub ebx,eax ; after that a % b is in ebx
of course there should be also some incs for corrections (as everything is in integer precision), but finally it still be very fast.

details on how to use multiplying by reciprocal are at http://agner.org/optimize/

in my tarsalzp is use fnv hash. entire code sequence to compute fnv- 1 hash of last four bytes is:

mov eax,2166136261
mov ebx,16777619
mul ebx
xor al,[edi-01]
mul ebx
xor al,[edi-02]
mul ebx
xor al,[edi-03]
mul ebx
xor al,[edi-04]
and eax,1 shl 22 - 1
; and we have fnv- 1 hash in eax
much slower than matts, multiplicative or modulo hashing but still very fast. in my tarsalzp im computing fnv- 1 hash for the last eight bytes sequence for every position in file and i think that computing such hashes consumes about 20 seconds on matts system when dealing with enwik9, so its resource requirements are negligible.

i think that good bit dispersion is more important as it would provide less collisions. lzpm can also use hash collision detection, eg. use highest 5 bits of fnv- 1 hash for collision detection.

maybe ilia is computing hashes too many times unnecessary wasting time. memoizing hashes would be a good technique is such case. or maybe its necessary (i doubt, but i cant tell without source code) and computing hashes consumes most of the time.

ilia: please share the code of lzpm

ive looked at quad sources. i noticed that you use 32- bit for every rolz context: 8 bits for detecting hash collisions of first four bytes and 24- bit for pointers. with good hash (when collisions are rare) you can use this 8- bit counter to store length of longest prefix with previous entry in table accelerating speed many times, especially on very reduntant files (like fp.log).

maybe ill play with such technique later. today im busy - i have an exam ;(

9. Originally Posted by Matt Mahoney
x *= C1;
x = x<<16|x>>16;
x *= C2;
Originally Posted by lpaq1.cpp
i*=123456791;
i=i<<16|i>>16; // use _rotr(i, 16) or _rotl(i, 16) instead?
i*=234567891;
int chk=i>>24;
Just tested this function. Well, with LZPM this one gives no improvement.

Currently I use:
Originally Posted by lzpm.cpp
#define HASHBITS 24
#define HASHSIZE (1 << HASHBITS)

#define gethash(i) ((*(reinterpret_cast<unsigned int *>(&buf[i - 1]))
* 123456791) >> (32 - HASHBITS))
Like you see I hash 32-bits to 24-bits.

Originally Posted by donkey7
maybe ilia is computing hashes too many times unnecessary wasting time. memoizing hashes would be a good technique is such case. or maybe its necessary (i doubt, but i cant tell without source code) and computing hashes consumes most of the time.
Actually hash is not so important with LZPM, I tried many, found the best. The catch in string searching (memory access). With max level LZPM tries all available matches, overwise we loose compression. With such mode I desperate for each byte. However, since we have a few compression levels, I even dont care by now. I will try to improve the existing code anyway.

Originally Posted by donkey7
ilia: please share the code of lzpm
Just currently I see no reason to do that - the code is not finished yet. Only a small amount of people will understand whats going on inside LZPM. In addition, LZPM is just an improved version of QUAD - which is open source. To prove that, let me introduce a part of the LZPM decoder v0.11 (look at its size!!):
Originally Posted by lzpm.cpp
static void decode() {
static int tab[256][TABSIZE];
int cnt[256];

__int64 size = 0;
if (size < 0)
fprintf(stderr, "size error"), exit(1);

ppm.decoder.Init();

while (--size >= 0) {
buf[0] = ppm.Decode(0);
int i = 1;

while ((i < N) && (--size >= 0)) {
int c = buf[i - 1];
int j = i;

int s = ppm.Decode(c);
if (s >= 256) {
size -= ((s -= (256 - MINMATCH)) - 1);

int p = tab[c][(s > MINMATCH
? cnt[c] - ppm.DecodeIndex() : cnt[c]) & (TABSIZE - 1)];
while (--s >= 0)
buf[i++] = buf[p++];
}
else
buf[i++] = s;

tab[c][++cnt[c] & (TABSIZE - 1)] = j;
}

fwrite(&buf, 1, i, output);
}
}
Does anyone sent me an improved version of QUAD which will be at least near to the LZPM in terms of compression ratio or decompression speed? Nope. I am the author and Im moving the shadow!

10. Actually memcpy() can be faster than loop:
Originally Posted by encode
while (--s >= 0)
buf[i++] = buf[p++];
->
Originally Posted by encode
memcpy(&buf[i], &buf[p], s); // DEBUG
i += s;
But due to some reason here memcpy() (and memmove()) works not correctly. Finding whats wrong...

The catch in overlaps - the buffer must be updated during match copying...

11. Memcpy will not be able to never work in how much is not in a position to copying reconstructing the structure!
My solution:
void fast_copy(unsigned char *src,unsigned char *dst,long len)
{while (len--) *dst++=*src++;}

12. Originally Posted by Nania Francesco Antonio
void fast_copy(unsigned char *src,unsigned char *dst,long len)
{while (len--) *dst++=*src++;}
I believe that a good compiler perform this automatically.

13. Try something like this:

pushfd

mov ecx,LEN
and ecx,ecx
jz @EXIT

lea esi,SRC
lea edi,DST
cld
rep movsb

popfd

Regards,

14. With VC2005 we can do the same thing with:
__movsb(&buf[i], &buf[p], s);
i += s;
Will make some tests to see what it gives. However, LZMA uses the approach I currently use.

15. Author claims his hash function to be the fastest, i can't check it
http://www.azillionmonkeys.com/qed/hash.html

16. The last hash code are not good in performance ! (tested)

17. Nania Francesco Antonio
a simple hash = value & mask could also serve as hash and it would be fastest hash possible but many algorithms require low number of collisions.

for example i'm using fnv hash in my tarsalzp despite it has relatively low performance. but if i change it to something simpler and faster i can gain only few percent of speed for compression ratio loss of few percent.

and fnv hash is commonly used

18. Originally Posted by nimdamsk
http://www.azillionmonkeys.com/qed/hash.html
it seems quite useless because its oriented to hashing large strings while only ~4bytes are hashed in our compressors

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•