Is there any theory or experiment that justifies the choice of linear combinations of the pixel neighborhood used in the PAQ8px image models?

I am currently trying to implement part of the im8bit model of PAQ8px in hardware (FPGA) for grayscale image compression. I am curious to understand how these combinations were selected. Is it worth running tests to look for better sets of linear combinations? Perhaps I could use fewer of these combinations and achieve similar compression, thus saving FPGA resources. There is a lot to think about.
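
To make the kind of test I have in mind concrete, here is a minimal Python sketch that scores a few linear combinations of neighboring pixels by their mean absolute prediction error on one image. The candidate set, the names `CANDIDATES` and `mean_abs_error`, and the error metric are all my own assumptions for illustration; this is not the actual im8bit list. As far as I understand, PAQ8px mostly uses such values as contexts rather than as direct predictions, so this error is only a rough proxy for compression gain.

```python
# Hypothetical candidate predictors built from the causal neighborhood
# (W = left, N = above, NW = above-left, NE = above-right).
# This is an illustrative set, NOT the actual PAQ8px im8bit set.

def clamp(value):
    """Clamp a prediction to the 8-bit pixel range."""
    return max(0, min(255, value))

CANDIDATES = {
    "W": lambda W, N, NW, NE: W,
    "N": lambda W, N, NW, NE: N,
    "W+N-NW": lambda W, N, NW, NE: clamp(W + N - NW),
    "(W+N)/2": lambda W, N, NW, NE: (W + N) // 2,
    "(W+NE)/2": lambda W, N, NW, NE: (W + NE) // 2,
}

def mean_abs_error(image):
    """Return the mean absolute error of each candidate over one image.

    `image` is a list of rows of 8-bit grayscale values. The first row and
    the first and last columns are skipped so every neighbor exists.
    """
    height, width = len(image), len(image[0])
    totals = {name: 0 for name in CANDIDATES}
    count = 0
    for y in range(1, height):
        for x in range(1, width - 1):
            W = image[y][x - 1]
            N = image[y - 1][x]
            NW = image[y - 1][x - 1]
            NE = image[y - 1][x + 1]
            actual = image[y][x]
            for name, predict in CANDIDATES.items():
                totals[name] += abs(actual - predict(W, N, NW, NE))
            count += 1
    return {name: total / count for name, total in totals.items()}

if __name__ == "__main__":
    # Synthetic horizontal gradient, used only to exercise the function.
    gradient = [[x * 10 for x in range(8)] for _ in range(8)]
    for name, error in sorted(mean_abs_error(gradient).items(), key=lambda kv: kv[1]):
        print(f"{name:10s} {error:.2f}")
```

Dropping one candidate at a time and re-measuring on a representative image set would indicate which combinations are redundant, although the real check would be the compressed size produced by the full model.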