Hi, I just stumbled across this in my arXiv feed (https://arxiv.org/abs/2008.04558) and was wondering about the approach.
From what I understand, the idea is to use the visually lossless mode of JPEG XS to produce a base image, and then use another JPEG variant to encode the residual image (suitably rescaled).
My question: does it make sense to use an algorithm designed for images on a residual image? It seems like the residual will not match the assumptions most image codecs make. If anything, it is almost by construction the part of the signal that does not fit any assumptions about natural or computer-generated images. Am I missing something obvious?
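
To make the setup concrete, here is a rough sketch of the base-plus-residual scheme as I understand it. It is not the paper's implementation: the uniform quantiser is only a stand-in for the lossy base codec (it is not JPEG XS), the +128 offset is my assumption for what "suitably rescaled" could mean, and all function names are made up for illustration.

```python
import random


def lossy_base(pixels, step=8):
    """Stand-in for the lossy base codec: plain uniform quantisation (NOT JPEG XS)."""
    return [min(255, (p // step) * step + step // 2) for p in pixels]


def residual_to_8bit(original, base):
    """Shift the signed residual into an unsigned 8-bit range.

    Assumes the base is close enough that |original - base| < 128.
    """
    return [o - b + 128 for o, b in zip(original, base)]


def reconstruct(base, residual8):
    """Undo the shift and add the residual back onto the base."""
    return [b + r - 128 for b, r in zip(base, residual8)]


random.seed(0)
# Toy 1-D "image": a smooth ramp with a little noise, clamped to 0..255.
original = [min(255, max(0, i + random.randint(-3, 3))) for i in range(256)]

base = lossy_base(original)
res = residual_to_8bit(original, base)

# If the residual is stored exactly, base + residual gives back the original.
assert reconstruct(base, res) == original
print("residual range:", min(res), "to", max(res))
```

The `res` list is the "image" that would be handed to the second codec, which is exactly the signal my question is about.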