# Thread: Fast CRC table construction and rolling CRC hash calculation

1. ## Fast CRC table construction and rolling CRC hash calculation

Inspired by Eugene Shelwien's fma-diff rolling CRC implementation, i've developed public domain implementation of the same idea, based on Igor Pavlov 7-zip code. This includes rolling CRC hash and fast generation of tables for plain and rolling CRC hash. Plain CRC table constructed 30-100x faster than with original algorithm, and rolling CRC table construction is 250-500 times faster. Hopefully it will be useful for your programs.

```/* crc.c -- Fast CRC table construction and rolling CRC hash calculation
2009-11-23 : Igor Pavlov : Public domain
2013-03-27 : Bulat.Ziganshin@gmail.com : Public domain */

#include <stdio.h>

#define kCrcPoly     0xEDB88320
#define CRC_INIT_VAL 0 /* 0xFFFFFFFF for zip/rar/7-zip quasi-CRC */

// For rolling CRC hash demo
#define WINSIZE      100
#define TESTSIZE     200

// Fast CRC table construction algorithm
void FastTableBuild (unsigned CRCTable[256], unsigned seed)
{
unsigned i, j, r;

CRCTable[0]   = 0;
CRCTable[128] = r = seed;
for (i=64; i; i/=2)
CRCTable[i] = r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));

for (i=2; i<256; i*=2)
for (j=1; j<i; j++)
CRCTable[i+j] = CRCTable[i] ^ CRCTable[j];
}

#define init_CRC()                  CRC_INIT_VAL
#define update_CRC(crc,CRCTable,c)  (CRCTable[((crc)^(c)) & 0xFF] ^ ((crc)>>8))
#define finish_CRC(crc)             ((crc) ^ CRC_INIT_VAL)

unsigned calcCRC (unsigned char *buffer, unsigned len, unsigned CRCTable[256])
{
unsigned crc = init_CRC(), i;
for (i=0; i<len; i++)
crc = update_CRC(crc,CRCTable,buffer[i]);
return finish_CRC(crc);
}

int main()
{
unsigned i, j, r, CRCTab[256], FastCRCTab[256], RollingCRCTab[256], crc1, crc2;

// Fast CRC table construction
FastTableBuild (FastCRCTab, kCrcPoly);

// Classic CRC table construction algorithm
for (i=0; i<256; i++)
{
r = i;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
CRCTab[i] = r;
if (FastCRCTab[i] != CRCTab[i])   // unit testing :)
printf("%02x %08x %08x\n", i, r, (r & 1) ? (r>>1)^kCrcPoly : (r>>1));
}

// Rolling CRC table construction
for (i=0; i<256; i++)
{
unsigned crc = init_CRC();
crc = update_CRC(crc,CRCTab,i);
for (j=0; j<WINSIZE; j++)
crc = update_CRC(crc,CRCTab,0);
RollingCRCTab[i] = finish_CRC(crc);
//    printf("+%02x %08x %08x\n",i,r, (r & 1) ? (r>>1)^kCrcPoly : (r>>1));
}

// Fast table construction for rolling CRC
FastTableBuild (FastCRCTab, RollingCRCTab[128]);
for (i=0; i<256; i++)
if (FastCRCTab[i] != RollingCRCTab[i])   // unit testing :)
printf("*%02x %08x %08x\n",i,FastCRCTab[i],RollingCRCTab[i]);

// Example of rolling CRC calculation and unit test simultaneously
unsigned char buffer[WINSIZE+TESTSIZE];
for (i=0; i<WINSIZE+TESTSIZE; i++)
buffer[i] = 11 + i*31 + i/17;   // random :)

// Let's calc CRC(buffer+TESTSIZE,WINSIZE) in two ways
crc1 = calcCRC(buffer+TESTSIZE,WINSIZE,CRCTab);
crc2 = calcCRC(buffer,WINSIZE,CRCTab);
for (i=0; i<TESTSIZE; i++)
crc2 = update_CRC(crc2,CRCTab,buffer[WINSIZE+i]) ^ RollingCRCTab[buffer[i]];
printf("%08x and %08x %s\n", crc1, crc2, crc1==crc2? "are equal":"ARE NOT EQUAL!");
return 0;
}```

Literature:
[1] http://www.xmailserver.org/rabin.pdf
[2] http://citeseer.ist.psu.edu/viewdoc/...10.1.1.53.6172
[3] http://www.drdobbs.com/parallel/fast...sing/229401411

2. ## Thanks (3):

schnaader (27th May 2018),Simorq (27th May 2018),SolidComp (30th May 2018)

3. At last, a good reason to use the new "twitter button" for mass spreading

By the way, here is the message we get when using it :
Fast CRC table construction and rolling CRC hash calculation http://encode.su/showpost.php?p=32727&postcount=1 via @YOUR-TWITTER-USERNAME
Maybe there is one parameter to change in order to replace @YOUR-TWITTER-USERNAME ....

4. Nice piece of work Bulat!! I am currently experimenting with rolling hashes and tried your implementation. Unfortunately it only worked with the #defined windows size of 100 and I was assuming it could be changed by simply modifying the preprocessor define.

My solution was to use another function to calculate the RollingCRCTab table like this:

Code:
```void FastTableBuild2 (unsigned CRCTable[256], unsigned FastCRCTable[256])
{
unsigned i,c,x,y;
for(c=0;c<256;c++)
{
x=0;
y=0;
x=CRCTable[c]^(x>>8);
for(i=0;i<WINSIZE;i++)
{
x=CRCTable[(unsigned char) x]^(x>>8);
y=CRCTable[(unsigned char) y]^(y>>8);
}
FastCRCTable[c]=x^y;
}
}```
Don't know if it is fast but it works. Maybe you can speed it up?

5. what you mean? the original code works ok for you? and then ou change only single line with WINSIZE define and it stops working? i ask you because i'm using this code in SREP with WINSIZE = 48 and trust me - it works

PS: it seems that your CRC_INIT_VAL isn't 0. it's why my code doesn't work for you. for a rolling hash you don't need to use non-zero CRC_INIT_VAL value. the only meaning of non-zero here is to differentiate between byte sequence XYZ and byte sequence 000XYZ that isn't required when all byte sequences are of the same size

6. Originally Posted by Bulat Ziganshin
what you mean? the original code works ok for you? and then ou change only single line with WINSIZE define and it stops working? i ask you because i'm using this code in SREP with WINSIZE = 48 and trust me - it works

PS: it seems that your CRC_INIT_VAL isn't 0. it's why my code doesn't work for you. for a rolling hash you don't need to use non-zero CRC_INIT_VAL value. the only meaning of non-zero here is to differentiate between byte sequence XYZ and byte sequence 000XYZ that isn't required when all byte sequences are of the same size
you are right. the original code works with windows sizes other than 100. i made the mistake of removing the "conventional" way of calculating the CRC and the rolling CRC tables. I thought they were only there for testing and comparing, but the call to FastTableBuild to build the rolling CRC table the fast way, requires that the rolling table is already existent:
Code:
`FastTableBuild (FastCRCTab, RollingCRCTab[128]);`
this is the code i ended up with:
Code:
```/* crc.c -- Fast CRC table construction and rolling CRC hash calculation
2009-11-23 : Igor Pavlov : Public domain
2013-03-27 : Bulat.Ziganshin@gmail.com : Public domain */

#include <stdio.h>

#define kCrcPoly     0xEDB88320
#define CRC_INIT_VAL 0 /* 0xFFFFFFFF for zip/rar/7-zip quasi-CRC */

// For rolling CRC hash demo
#define WINSIZE      48
#define TESTSIZE     3

// Fast CRC table construction algorithm
void FastTableBuild (unsigned CRCTable[256], unsigned seed)
{
unsigned i, j, r;

CRCTable[0]   = 0;
CRCTable[128] = r = seed;
for (i=64; i; i/=2)
CRCTable[i] = r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));

for (i=2; i<256; i*=2)
for (j=1; j<i; j++)
CRCTable[i+j] = CRCTable[i] ^ CRCTable[j];
}

#define BYTE (unsigned char)

void FastTableBuild2 (unsigned CRCTable[256], unsigned FastCRCTable[256])
{
unsigned i,c,x,y;

for(c=0;c<256;c++)
{
x=0;
y=0;
x=CRCTable[c]^(x>>8);
for(i=0;i<WINSIZE;i++)
{
x=CRCTable[(unsigned char) x]^(x>>8);
y=CRCTable[(unsigned char) y]^(y>>8);
}
FastCRCTable[c]=x^y;
}
}

#define init_CRC()                  CRC_INIT_VAL
#define update_CRC(crc,CRCTable,c)  (CRCTable[((crc)^(c)) & 0xFF] ^ ((crc)>>8))
#define finish_CRC(crc)             ((crc) ^ CRC_INIT_VAL)

unsigned calcCRC (unsigned char *buffer, unsigned len, unsigned CRCTable[256])
{
unsigned crc = init_CRC(), i;
for (i=0; i<len; i++)
crc = update_CRC(crc,CRCTable,buffer[i]);
return finish_CRC(crc);
}

int main()
{
unsigned i, CRCTab[256], RollingCRCTab[256], crc1, crc2;

// Fast CRC table construction
FastTableBuild (CRCTab, kCrcPoly);

// Fast table construction for rolling CRC
FastTableBuild2 (CRCTab, RollingCRCTab);

// Example of rolling CRC calculation and unit test simultaneously
unsigned char buffer[WINSIZE+TESTSIZE];
for (i=0; i<WINSIZE+TESTSIZE; i++)
buffer[i] = 11 + i*31 + i/17;   // random :)

// Let's calc CRC(buffer+TESTSIZE,WINSIZE) in two ways
crc1 = calcCRC(buffer+TESTSIZE,WINSIZE,CRCTab);
crc2 = calcCRC(buffer,WINSIZE,CRCTab);
for (i=0; i<TESTSIZE; i++)
crc2 = update_CRC(crc2,CRCTab,buffer[WINSIZE+i]) ^ RollingCRCTab[buffer[i]];
printf("%08x and %08x %s\n", crc1, crc2, crc1==crc2? "are equal":"ARE NOT EQUAL!");

return 0;
}```
maybe in your real world code, you already know the value of RollingCRCTab[128] for your specific window size. doing it first the slow way and then the fast way doesn't really make sense, right?

Most likely the following code from your original posting would be even faster than my FastTableBuild2 and correctly use the WINDOW_SIZE:
Code:
```for (i=0; i<256; i++)  {

unsigned crc = init_CRC();

crc = update_CRC(crc,CRCTab,i);

for (j=0; j<WINSIZE; j++)

crc = update_CRC(crc,CRCTab,0);

RollingCRCTab[i] = finish_CRC(crc);

//    printf("+%02x %08x %08x\n",i,r, (r & 1) ? (r>>1)^kCrcPoly : (r>>1));

}```
but your code provided a good example and the table creation shouldn't be a performance problem anyways as it only happens once usually.

7. it's performed in the following way:

```
// Fast CRC table construction
FastTableBuild (CRCTab, kCrcPoly);

// Rolling CRC table construction
i=128;
{
unsigned crc = init_CRC();
crc = update_CRC(crc,CRCTab,i);
for (j=0; j<WINSIZE; j++)
crc = update_CRC(crc,CRCTab,0);
RollingCRCTab[i] = finish_CRC(crc);
}

// Fast table construction for rolling CRC
FastTableBuild (RollingCRCTab, RollingCRCTab[128]);
```

you need only one value FastCRCTab[128] aka RollingCRCTab[128] and remaining values are calculated based on it. it's why my code is 256 times faster than your - the first loop is the same (y is always 0 in your code), but i execute it only once with i=128. the first cycle can also be made faster using fast power algorithm (it's just kCrcPoly**WINSIZE in the Galois field). there is google crc library that makes a lot of amazing things with those crcs

i'm sorry - this code is really hard to follow. you may download real working SREP code, that includes CrcRollingHash class, at http://freearc.org/research/SREP39.htm

8. thx for the enlightenment. this way you don't have to calc the entire rolling CRC table but instead only the element needed for your fast algorithm. cool!

9. 500% faster?!?!?!!!?!
Nice job, then!

10. i got the question how to build rolling CRC tables for non-zero CRC_INIT_VAL. unfortunately, it can't use fast algorithm. instead use slow one:
Code:
```void BuildRollingCRCTable (const unsigned CRCTable[256], unsigned RollingCRCTable[256])
{
unsigned i,c,x,y;
for(c=0;c<256;c++)
{
x = init_CRC();
y = init_CRC();
x = update_CRC(x,CRCTable,c);
for(i=0;i<WINSIZE;i++)
{
x = update_CRC(x,CRCTable,0);
y = update_CRC(y,CRCTable,0);
}
RollingCRCTable[c] = finish_CRC(x) ^ finish_CRC(y);
}
}```
of course, "y" can be calculated just once. if you have large WINSIZE (>=512) you can speedup calculation of "x" using power_in_galois_field function, implemented f.e. in the google crc library

Alternatively, you can use IncWinTab_Init from Eugene Shelwien's fma-diff rolling CRC implementation

11. Originally Posted by Bulat Ziganshin
i got the question how to build rolling CRC tables for non-zero CRC_INIT_VAL
Thank you so much! It works perfectly.

12. I did some debugging on the code here to get it to work for the CRC_INIT_VAL != 0 case. http://github.com/BartMassey/rolling-crc . Thanks huge for providing this codebase, it has been really helpful.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•