
Originally Posted by SolidComp
I've also wondered about combining raster and vector approaches in the same format, since it's clear that some images (or some regions within an image, like text and flat graphics overlaid on a photo) would benefit much more from one encoding than the other.
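For what it's worth, you can approximate the hybrid idea today by wrapping a raster payload inside an SVG and layering vector elements over it. A minimal sketch, where the file names and overlay geometry are made up:

```python
# Sketch: a crude "hybrid" image, a JPEG for the photographic region
# embedded in an SVG that adds resolution-independent vector elements.
import base64

with open("photo.jpg", "rb") as f:  # hypothetical input photo
    jpeg_b64 = base64.b64encode(f.read()).decode("ascii")

svg = f"""<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink" width="800" height="600">
  <!-- photographic region: raster, JPEG-compressed -->
  <image x="0" y="0" width="800" height="600"
         xlink:href="data:image/jpeg;base64,{jpeg_b64}"/>
  <!-- flat graphics and text: vector, stays sharp at any scale -->
  <rect x="20" y="500" width="360" height="80" fill="#003366"/>
  <text x="40" y="550" font-size="36" fill="white">Caption text</text>
</svg>"""

with open("hybrid.svg", "w") as f:
    f.write(svg)
```

A purpose-built format could obviously do much better than this, e.g. by sharing a color space and compressing both layers together, but it shows the basic division of labor.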
I also want to see a format that explicitly accounts for image sets in addition to standalone images, since a lot of groups of images on webpages, social media, etc. feature the same settings and subjects and so have lots of encodable overlap. It sounds sort of like video encoding, but the images in a set are far less similar to each other than adjacent video frames are, so exploiting the overlap is a more complicated problem.
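You can get a rough feel for how much redundancy a shared reference exposes with a quick experiment: keep one shot as the reference and compress the others as residuals against it. A sketch, assuming same-size, roughly aligned shots (the file names are hypothetical):

```python
# Sketch: measure how compressible an image set becomes when each image
# is stored as a residual against a shared reference frame.
import zlib
import numpy as np
from PIL import Image

paths = ["shot1.png", "shot2.png", "shot3.png"]  # hypothetical image set
frames = [np.asarray(Image.open(p).convert("RGB"), dtype=np.int16)
          for p in paths]
reference = frames[0]

for path, frame in zip(paths[1:], frames[1:]):
    raw = frame.astype(np.uint8).tobytes()
    residual = (frame - reference).tobytes()  # mostly zeros if shots overlap
    print(path,
          "raw:", len(zlib.compress(raw, 9)),
          "residual:", len(zlib.compress(residual, 9)))
```

zlib is a stand-in for a real entropy coder, and real photos would need alignment and motion/viewpoint compensation first, which is exactly where this diverges from the adjacent-frames case in video.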
The DNN researchers at Google have done some impressive work with images and image recognition, and I wonder if that could be applied to an approach that uses reference images for specific people, places, logos, etc., and does something like delta encoding for the many images that feature them. At first blush this seems unlikely to lead to significant savings, but I think that initial intuition is wrong, and that it's just a matter of much better algorithms and constructs.
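The non-obvious step is matching a new image to the right canonical reference before delta coding it. Here's a toy sketch of that step only; the reference library, file names, and the pixel-difference metric are all stand-ins for whatever a trained recognition model would actually supply:

```python
# Sketch: pick the closest canonical reference (logo, recurring subject)
# for a new image, as a precursor to delta encoding against it.
import numpy as np
from PIL import Image

def load(path, size=(256, 256)):
    # Grayscale thumbnails keep the toy comparison cheap.
    return np.asarray(Image.open(path).convert("L").resize(size),
                      dtype=np.float32)

# Hypothetical library of reference images.
library = {name: load(name)
           for name in ["logo_a.png", "logo_b.png", "face_ref.png"]}
target = load("new_image.png")

# Crude stand-in for learned similarity: mean absolute pixel difference.
best = min(library, key=lambda name: np.abs(library[name] - target).mean())
print("encode as delta against:", best)
```

A learned embedding would replace the pixel metric, and the hard research problem is making the delta worth storing once pose, lighting, and scale differ, which is where the "much better algorithms" would have to earn their keep.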
Another idea I had was about what we could do at the time of image acquisition, when we're composing and taking photographs, to achieve smaller file sizes at a given level of quality. I've never read anyone talk about this before, but there are so many variables in, say, headshots and product photography (backgrounds, lighting, colors, color gradients, angles, reflectance, etc.) that it might be possible to subtly manipulate them in ways that yield surprisingly large savings in the resulting JPEGs, WebPs, or PIKs without significantly hurting, or maybe even while improving, how humans perceive the image.
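A cheap way to test that intuition is to simulate one such acquisition choice after the fact, e.g. a smoother, lower-detail backdrop, and measure JPEG size at a fixed quality setting. A sketch (the file name and blur radius are made up, and the blur is applied to the whole frame as a crude stand-in for a cleaner background):

```python
# Sketch: how much does removing fine detail shrink a JPEG at the same
# quality setting? Simulates choosing a smoother backdrop at shoot time.
import io
from PIL import Image, ImageFilter

def jpeg_size(img, quality=85):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.tell()

img = Image.open("headshot.png").convert("RGB")  # hypothetical input
smoothed = img.filter(ImageFilter.GaussianBlur(radius=2))

print("original:", jpeg_size(img), "bytes")
print("smoothed:", jpeg_size(smoothed), "bytes")  # typically noticeably smaller
```

High-frequency detail is what costs bits in DCT-based codecs, so anything that removes it from regions viewers don't attend to (busy backgrounds, noisy textures) should pay off, and you could run exactly this kind of A/B measurement on real alternative shots of the same subject.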