Hi,
I made a small modification to paq8l that seems to slightly improve its compression rate: when constructing the mixer, I initialize the neural network weights to pseudorandom values (with a constant seed) instead of a fixed value.
Empirical testing indicates that the random-weight version consistently outperforms the original paq8l, although the improvement is very small (usually around 0.001 bits per byte of the original file). Since the change has no significant computational cost, I thought it would be worth posting about in this forum; newer versions of paq8 based on paq8l might benefit from this change as well.
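To make the idea concrete, here is a minimal self-contained sketch (not the actual paq8l diff): the array size, the base value w, and the offset range below are placeholders for illustration; the point is only that the mixer weights start at small, deterministic pseudorandom values rather than one constant.
Code:
// Sketch: fill the mixer weight array with small pseudorandom values
// instead of a single fixed value.  Sizes and ranges are illustrative,
// not the ones used in the actual patch.
#include <cstdio>
#include <cstdlib>

int main() {
  const int N = 8, M = 4;   // inputs per set, number of weight sets
  const int w = 0;          // fixed value the weights would normally get
  short wx[N * M];          // 16-bit mixer weights

  srand(1);                 // constant seed, so every run is identical
  for (int i = 0; i < N * M; ++i)
    wx[i] = short(w + rand() % 513 - 256);  // w plus a small offset in [-256, 256]

  for (int i = 0; i < N * M; ++i)
    printf("%d ", wx[i]);
  printf("\n");
  return 0;
}
In paq8l itself this would go in the Mixer constructor, in the loop that fills the weight array; the size of the random offset relative to the weight scale is a tunable.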
Here are some results on the Calgary corpus files (cross-entropy in bits per byte):
Code:
File       paq8l          random weights   difference
bib        1.4969681203   1.4956374574     0.001330663
book1      2.0057252441   2.0055120836     0.0002131605
book2      1.5953075026   1.5950521764     0.0002553262
geo        3.4345631905   3.4326454519     0.0019177386
news       1.9057344947   1.9053439502     0.0003905445
obj1       2.7735791883   2.7624077079     0.0111714805
obj2       1.4549899071   1.4544100564     0.0005798507
paper1     1.9654265344   1.9622643897     0.0031621447
paper2     1.9904600822   1.9884160096     0.0020440726
pic        0.3508760218   0.3506340518     0.00024197
progc      1.9157351214   1.9121361326     0.0035989887
progl      1.1831317743   1.1815620494     0.0015697249
progp      1.1507982053   1.1487394746     0.0020587307
trans      0.9916905984   0.9899215665     0.0017690319