Hi all,
Typical compressors are what I call naïve compressors – they don't know anything about the content they're compressing, and all content is approached in exactly the same way. I've wondered if it would be possible to improve the output quality of audio compression in particular by using metadata (or just data) that is specific to the content being compressed. For example, if we were compressing Born This Way by Lady Gaga, this is a song – a data stream – that we already know a great deal about. But every time we compress/encode it, we and our codecs act as though we know nothing about it, like it's the first time we're seeing this data stream. Instead, I imagine that we could have a metadata or config file for that song. It could tell the compressor any number of things that would help it produce a better result. For example, at a high level, it could say something like "Don't cut off the low end of the drum set from 2:02 to 2:23."
I'm just making that example up. I don't know enough about how MP3, AAC, and Opus work to know if song-specific metadata/config files could improve their output. Obviously, the encoders would have to be modified to process such metadata. As for the metadata/configs, I imagine they would be shareable and people might compete to come up with the best optimizations.
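To make the idea a bit more concrete, here is a rough sketch of what a per-track hints file might boil down to once loaded. Everything in it is hypothetical: the `EncoderHint` type, the `preserve_low_end` setting name, and the lookup function are my own inventions for illustration, and no existing MP3, AAC, or Opus encoder reads anything like this as far as I know.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderHint:
    """One hypothetical instruction that applies to a time range of a track."""
    start_s: float   # start of the range, in seconds
    end_s: float     # end of the range (exclusive), in seconds
    setting: str     # made-up setting name an encoder would have to understand
    value: object    # value for that setting


def parse_timestamp(text: str) -> float:
    """Convert a 'M:SS' timestamp such as '2:02' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def hints_at(hints, t_s):
    """Return the hints whose time range covers position t_s (in seconds)."""
    return [h for h in hints if h.start_s <= t_s < h.end_s]


# The made-up example from the post: keep the low end of the drums
# between 2:02 and 2:23.
TRACK_HINTS = [
    EncoderHint(parse_timestamp("2:02"), parse_timestamp("2:23"),
                "preserve_low_end", True),
]

# A modified encoder could ask, frame by frame, which hints are active.
for position in (10.0, 130.0):
    active = [h.setting for h in hints_at(TRACK_HINTS, position)]
    print(f"t={position:.0f}s active hints: {active}")
```

A file like this would be small and easy to share, which is what would make the "compete for the best config" part possible. The hard part is the encoder side, which this sketch doesn't touch at all.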
I was reminded of this idea when looking at some specs for some new cars and I noticed an option called "Clari-Fi™ Music Restoration Technology". I looked it up and it's a Harman Kardon technology – here's the page.
One of the things they say about it is:
"Once identified, Clari-Fi intelligently corrects waveform deficiencies based on existing music information and audio source quality."
The part about "existing music information" made me think that it's got metadata on potentially thousands of popular tracks, and that it uses that metadata to improve the playback (it either stores this metadata locally on flash memory, or downloads it as soon as it recognizes the song). Does anyone know anything about how Clari-Fi works? I didn't see much detail on the website. It sounds like the flip side of my idea, where instead of using metadata during the encoding phase to produce better sound, it uses metadata during decode/playback to produce better sound.
I actually thought of that before, but not for audio. I thought maybe we could improve video streaming quality if we had metadata about the content that could be applied during playback.
I think Netflix is doing some kind of content-aware compression. The quality they achieve at their bitrates is basically magic – we're not supposed to be able to get H.264 down to such low bitrates for HD quality. They're at less than 6 Mbit/sec while Blu-ray is over 30 Mbit/sec and broadcast HD is in the 20 Mbit/sec ballpark. They've blogged about how they tailor the compression to the content, but I'm not sure if they've gone into detail.
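To put those bitrates in perspective, here is a quick back-of-the-envelope conversion to data per hour of video. The three rates are just the rough figures quoted above (6 as an upper bound, 30 as a lower bound), not measurements, and I'm using decimal units (1 GB = 10^9 bytes).

```python
def gigabytes_per_hour(mbit_per_sec: float) -> float:
    """Convert a constant bitrate in Mbit/sec to gigabytes per hour (decimal GB)."""
    bits_per_hour = mbit_per_sec * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


# Rough figures from the post, in Mbit/sec.
RATES = {
    "Netflix HD (upper bound)": 6,
    "broadcast HD (ballpark)": 20,
    "Blu-ray (lower bound)": 30,
}

for label, rate in RATES.items():
    print(f"{label}: {rate} Mbit/sec is about {gigabytes_per_hour(rate):.1f} GB/hour")
```

So by these numbers the stream is carrying at most a fifth of the data of the Blu-ray figure, which is why the quality seems so surprising.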
Those of you who are more familiar with audio compression and the MP3, AAC, and Opus formats – is what I'm describing possible? Could encoders for those formats be modified to fruitfully use track-specific metadata, improving audio quality?
(Maybe this is what Apple's Mastered for iTunes program does. I'm not sure. If you haven't heard of it, it's not a program as in a software application – it's a program as in a coordinated effort with some record companies, artists, etc. to improve the quality of the songs sold on iTunes. I think only a minority of the songs on iTunes are "Mastered for iTunes" at this point. The bitrate doesn't change – it's still 256 kbps AAC.)