It's funny indeed. LILY and paq8gen are always head to head. So following that pattern, paq8gen must now go under 600K. OK. I don't know how on earth that's going to happen yet. Feel free to add your magic there.
I'm not that serious about squeezing the last bits out of the exe. (You can see it's not my priority.) Pushing compression further is what I'm into. When enhancing paq8gen I also check my changes against two other corpora: what I call the small DNA corpus (https://encode.su/threads/2105-DNA-Corpus) and the large one: https://tinyurl.com/DNAcorpus (referenced from GeCo3: https://github.com/cobilab/geco3). So for me it's a threefold challenge.
Paq8gen is still far from a proper genomic sequence compressor: for example, it should support a line-unwrapping transform and automatically reorder sequence names and sequences. Or look for palindromes... Or anything I haven't even heard of.
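To make the line-unwrapping idea a bit more concrete, here's a rough Python sketch. It is not paq8gen code: the names `unwrap` and `rewrap` are made up for illustration, and it assumes FASTA-like input where every record starts with a `>` header, each record is wrapped at one fixed width, and the file ends with a newline. A real transform would need a fallback for anything irregular.

```python
def _pack(header, lines):
    """Join the wrapped lines of one record and remember the wrap width."""
    width = len(lines[0]) if lines else 0
    # Assumption: every line except the last is exactly `width` long,
    # and the last line is non-empty and no longer than `width`.
    if any(len(line) != width for line in lines[:-1]):
        raise ValueError("irregular wrapping")
    if lines and not 0 < len(lines[-1]) <= width:
        raise ValueError("irregular wrapping")
    return header, "".join(lines), width


def unwrap(fasta):
    """Turn FASTA text into a list of (header, sequence, line_width)."""
    records = []
    header, lines = None, []
    for line in fasta.splitlines():
        if line.startswith(">"):
            if header is not None:
                records.append(_pack(header, lines))
            header, lines = line, []
        elif header is None:
            raise ValueError("sequence data before the first header")
        else:
            lines.append(line)
    if header is not None:
        records.append(_pack(header, lines))
    return records


def rewrap(records):
    """Inverse of unwrap: restore the original line breaks."""
    out = []
    for header, seq, width in records:
        out.append(header)
        if width:  # width is 0 only for a record with no sequence lines
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

The point is that the model then sees one uninterrupted nucleotide stream per record, with the line width stored once on the side, instead of a newline every N bases. With the headers split off like this, they could also be grouped and modeled separately from the sequences.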
If you've got ideas and have some time - feel free to join and let's make it a good entropy estimator in the genomic-compression field.