
Originally Posted by
rhulcomputerscience
Some files, such as error-free source code files, are constrained by the grammar of their programming language.
A grammar gives a precise yet easy-to-understand syntactic specification of a programming language, and therefore a restriction on what valid text can look like. This restriction should benefit compression algorithms: as well as compiling probabilities for the next letter, a compressor could use the grammar to rule out characters that cannot legally come next.
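To make the idea concrete, here is a minimal sketch of it, not a real compressor. Everything in it is an assumption for illustration: the toy grammar (balanced parentheses around the letters a and b, with `$` as an end marker), the fixed probability table `PROBS`, and the helper names `allowed_next` and `code_length_bits`. It only computes the ideal code length an entropy coder would approach, with and without the grammar restriction.

```python
import math

# Hypothetical static model: probability of each symbol, ignoring context.
PROBS = {"(": 0.2, ")": 0.2, "a": 0.25, "b": 0.25, "$": 0.1}


def allowed_next(prefix: str) -> set[str]:
    """Symbols the toy grammar permits after `prefix`.

    Toy grammar (an assumption): balanced parentheses containing
    the letters a and b, terminated by '$' once all parentheses are closed.
    """
    depth = prefix.count("(") - prefix.count(")")
    allowed = {"(", "a", "b"}
    # ')' is only legal inside an open parenthesis; '$' only at depth 0.
    allowed.add(")" if depth > 0 else "$")
    return allowed


def code_length_bits(text: str, use_grammar: bool) -> float:
    """Ideal code length of `text` in bits under the static model.

    With use_grammar=True, symbols the grammar forbids get probability 0
    and the remaining probabilities are renormalised.
    """
    bits = 0.0
    for i, ch in enumerate(text):
        candidates = allowed_next(text[:i]) if use_grammar else set(PROBS)
        if ch not in candidates:
            raise ValueError(f"symbol {ch!r} at position {i} violates the grammar")
        total = sum(PROBS[c] for c in candidates)
        bits += -math.log2(PROBS[ch] / total)
    return bits


sample = "(a(b))$"
plain = code_length_bits(sample, use_grammar=False)
masked = code_length_bits(sample, use_grammar=True)
print(f"without grammar: {plain:.2f} bits, with grammar: {masked:.2f} bits")
```

The saving in this sketch comes purely from the probability mass that the unrestricted model spends on symbols the grammar forbids. A decoder can apply the same restriction, because it knows the same prefix at every step.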
Has any research been conducted on this topic? I am very interested.
Kind regards.