
Thread: Content-aware compression; Clari-Fi by Harman Kardon

  1. #1
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    279
    Thanks
    109
    Thanked 51 Times in 35 Posts

    Content-aware compression; Clari-Fi by Harman Kardon

    Hi all,

    Typical compressors are what I call naïve compressors: they don't know anything about the content they're compressing, and they approach all content in exactly the same way. I've wondered whether the output quality of audio compression in particular could be improved by using metadata (or just data) that is specific to the content being compressed. For example, if we were compressing Born This Way by Lady Gaga, that song is a data stream we already know a great deal about. But every time we compress or encode it, we and our codecs act as though we know nothing about it, as if we were seeing the data stream for the first time. Instead, I imagine we could have a metadata or config file for that song, which could tell the compressor any number of things that would help it produce a better result. At a high level, it could be something like "Don't cut off the low end of the drum set from 2:02 to 2:23".

    I'm just making that example up. I don't know enough about how MP3, AAC, and Opus work to know whether song-specific metadata/config files could improve their output. Obviously, the encoders would have to be modified to process such metadata. As for the metadata/configs, I imagine they would be shareable and people might compete to come up with the best optimizations.
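
    To make the idea a bit more concrete, here is a minimal sketch of what such a per-track hint file could look like and how a modified encoder might look up the hints that apply at a given moment. Everything in it is hypothetical: the JSON layout, the `preserve_band` action name, the 20-120 Hz band, and the helper functions are invented for illustration and do not correspond to any existing encoder.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Hint:
    start_s: float   # start of the time range, in seconds
    end_s: float     # end of the time range, in seconds
    action: str      # hypothetical instruction name
    band_hz: tuple   # (low, high) frequency band the hint applies to


def parse_timestamp(text):
    """Convert an "m:ss" timestamp into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def load_hints(raw_json):
    """Parse the hypothetical hint file into a list of Hint objects."""
    doc = json.loads(raw_json)
    return [
        Hint(
            parse_timestamp(h["from"]),
            parse_timestamp(h["to"]),
            h["action"],
            tuple(h["band_hz"]),
        )
        for h in doc["hints"]
    ]


def active_hints(hints, t_s):
    """Return the hints whose time range covers position t_s (seconds)."""
    return [h for h in hints if h.start_s <= t_s < h.end_s]


# Hypothetical hint file for one track (format invented for this sketch).
EXAMPLE = """
{
  "track": "Born This Way",
  "hints": [
    {"from": "2:02", "to": "2:23", "action": "preserve_band", "band_hz": [20, 120]}
  ]
}
"""

hints = load_hints(EXAMPLE)
print(active_hints(hints, 130.0))  # inside 2:02 to 2:23, so one hint applies
print(active_hints(hints, 10.0))   # outside the range, so no hints apply
```

    An encoder would still have to decide what to do with each hint (for example, spend more bits on that band for those frames); the sketch only covers the lookup.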

    I was reminded of this idea when I was looking at specs for some new cars and noticed an option called "Clari-Fi™ Music Restoration Technology". I looked it up and it's a Harman Kardon technology – here's the page.

    One of the things they say about it is:

    Once identified, Clari-Fi intelligently corrects waveform deficiencies based on existing music information and audio source quality.
    The part about "existing music information" made me think that it's got metadata on potentially thousands of popular tracks, and that it uses that metadata to improve the playback (it either stores this metadata locally on flash memory, or downloads it as soon as it recognizes the song). Does anyone know anything about how Clari-Fi works? I didn't see much detail on the website. It sounds like the flip side of my idea, where instead of using metadata during the encoding phase to produce better sound, it uses metadata during decode/playback to produce better sound.

    I actually thought of that before, but not for audio. I thought maybe we could improve video streaming quality if we had metadata about the content that could be applied during playback.

    I think Netflix is doing some kind of content-aware compression. The quality they achieve at their bitrates is basically magic – we're not supposed to be able to get H.264 down to such low bitrates for HD quality. They're at less than 6 Mbit/sec, while Blu-ray is over 30 Mbit/sec and broadcast HD is in the 20 Mbit/sec ballpark. They've blogged about how they tailor the compression to the content, but I'm not sure if they've gone into detail.
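
    For a sense of scale, the bitrates quoted above can be converted into data per hour of video. This is plain arithmetic on the figures in this post (6, 20 and 30 Mbit/sec), assuming decimal units (1 GB = 10^9 bytes); the function name is just for illustration.

```python
def gigabytes_per_hour(mbit_per_s):
    """Convert a bitrate in Mbit/sec to gigabytes per hour (decimal units)."""
    bits_per_hour = mbit_per_s * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


for label, rate in [("Netflix HD (under)", 6), ("Broadcast HD (about)", 20), ("Blu-ray (over)", 30)]:
    print(f"{label}: {rate} Mbit/sec = {gigabytes_per_hour(rate):.1f} GB/hour")
```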

    Those of you who are more familiar with audio compression and the MP3, AAC, and Opus formats – is what I'm describing possible? Could encoders for those formats be modified to fruitfully use track-specific metadata, improving audio quality?

    (Maybe this is what Apple's Mastered for iTunes program does. I'm not sure. If you haven't heard of it, it's not a program as in a software application – it's a program as in a coordinated effort with some record companies, artists, etc. to improve the quality of the songs sold on iTunes. I think only a minority of the songs on iTunes are "Mastered for iTunes" at this point. The bitrate doesn't change – it's still 256 kbps AAC.)

  2. #2
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,488
    Thanks
    26
    Thanked 130 Times in 100 Posts
    I've tried to find a description that is not marketing gibberish, but failed. I think they have some sort of pre-trained model which compares low-quality sounds with a database of badly encoded sounds and swaps them with high-quality ones. If that's the approach, then there's a problem with fidelity: the same badly encoded sound can result from several different high-quality sounds. We can only guess which original it corresponds to.
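
    A toy illustration of that many-to-one problem, using coarse rounding as a stand-in for a lossy encoder. Real codecs are far more sophisticated; the `quantize` function and the sample values are made up for this sketch.

```python
def quantize(samples, step):
    """Round each sample to the nearest multiple of `step` (a crude lossy encoder)."""
    return [step * round(s / step) for s in samples]


# Two different "high-quality" signals...
original_a = [101, 198, 305, 397]
original_b = [96, 203, 299, 404]

# ...that become identical after lossy encoding.
encoded_a = quantize(original_a, 50)
encoded_b = quantize(original_b, 50)

print(encoded_a)               # [100, 200, 300, 400]
print(encoded_a == encoded_b)  # True
```

    Given only `[100, 200, 300, 400]`, a restorer has no way to tell whether the source was `original_a`, `original_b`, or something else; it can only pick a plausible candidate.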

  3. #3
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    170
    Thanks
    27
    Thanked 73 Times in 43 Posts
    Quote Originally Posted by SolidComp View Post
    Typical compressors are what I call naïve compressors – they don't know anything about the content they're compressing, and all content is approached in exactly the same way.
    True, however some compressors have built-in systems to analyze the data and try to figure out the best representation. FreeArc Next, NanoZip, WinRAR, GLZA, Razor (from the looks of it), and sometimes 7-Zip (based on file extension) use some form of data analysis. They all look for channels and either reorder them or apply linear prediction, dedicated audio filters, or some variation thereof.
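
    As a rough sketch of what such a filter does, here is the simplest possible linear predictor (predict each sample as the previous one and store only the difference), applied to a synthetic 440 Hz tone before handing the bytes to zlib. This is a generic delta filter written for illustration, not the actual filter used by any of the archivers named above; the test signal and function names are assumptions.

```python
import math
import struct
import zlib


def delta_encode(samples):
    """Replace each sample with its difference from the previous sample."""
    prev = 0
    residuals = []
    for s in samples:
        residuals.append(s - prev)
        prev = s
    return residuals


def delta_decode(residuals):
    """Invert delta_encode by accumulating the differences."""
    prev = 0
    samples = []
    for r in residuals:
        prev += r
        samples.append(prev)
    return samples


def pack16(values):
    """Pack integers as little-endian signed 16-bit PCM."""
    return struct.pack("<%dh" % len(values), *values)


# Synthetic mono signal: a 440 Hz sine sampled at 44.1 kHz.
samples = [round(10000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(4000)]
residuals = delta_encode(samples)

# The filter is lossless: decoding gives back the exact input.
assert delta_decode(residuals) == samples

print("largest sample magnitude:  ", max(abs(s) for s in samples))
print("largest residual magnitude:", max(abs(r) for r in residuals))
print("zlib size, raw samples:    ", len(zlib.compress(pack16(samples))))
print("zlib size, delta residuals:", len(zlib.compress(pack16(residuals))))
```

    On smooth audio the residuals are much smaller than the samples, which usually gives the general-purpose compressor an easier job; real audio filters use higher-order predictors and handle each channel separately.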

    As for Clari-Fi, their description of the technology doesn't give any specifics. To me it's just marketing, but my guess is that it's an additive acoustic model: basically an audio predictor and de-noiser, sort of like what your vehicle's built-in equalizers are made for.

    Once quality is removed it cannot be restored, but it can be roughly estimated; some components in a lossy stream may be quantized more than others. For example if a drum occurs at every half beat and a snare at every beat then we can bring out the quality of the drum over top of the next snare in an attempt to minimize any distortion caused by the mixing of the two instruments.

    What you are thinking of is likely doable, using pattern recognition tuned for waveform data, but it's a lot of work.

  4. #4
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    403
    Thanks
    278
    Thanked 283 Times in 149 Posts
    "It restores what was lost" - that's marketing hype.
    It tries to modify the waveform so it sounds more natural by "simply" tweaking the audio.

    From http://www.cieri.net/Documenti/Altri...0(English).pdf

    CLARIFI analyzes the audio signal in real time and corrects waveform deficiencies based on existing audio information and an in-depth knowledge of codecs and psychoacoustics.
    CLARIFI’s intelligent algorithm scales the amount of restoration it provides based on source bitrate.
    - BANDWIDTH EXTENSION: Adds high-frequency bandwidth extension for more clarity and detail.
    - FREQUENCY FILL: Fills unnatural frequency holes in the non-dominant signals that are caused by compression.
    - REVERB FILL: Restores the natural reverb tail for a more natural, smooth and accurate reverb signature.
    - TRANSIENT ENHANCEMENT: Sharpens the attack and decay of transient signals such as percussion, and restores their original peaky quality.
    (See the images in the PDF.)

  5. #5
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    403
    Thanks
    278
    Thanked 283 Times in 149 Posts
    Quote Originally Posted by Lucas View Post
    What you are thinking of is likely doable, using pattern recognition tuned for waveform data, but it's a lot of work.
    And Harman Kardon would probably hire you. You would do the work, they would do the marketing.

  6. #6
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    170
    Thanks
    27
    Thanked 73 Times in 43 Posts
    Quote Originally Posted by Gotty View Post
    And Harman Kardon would probably hire you. You would do the work, they would do the marketing.
    Thanks! But I've already been snagged up to build videogame streaming solutions.

  7. #7
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    279
    Thanks
    109
    Thanked 51 Times in 35 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    I've tried to find a description that is not marketing gibberish, but failed. I think they have some sort of pre-trained model which compares low-quality sounds with a database of badly encoded sounds and swaps them with high-quality ones. If that's the approach, then there's a problem with fidelity: the same badly encoded sound can result from several different high-quality sounds. We can only guess which original it corresponds to.
    You're saying that any given badly encoded sound can be the result of any of several different high-quality sounds? Is this one-to-many correspondence true of video compression as well?

    After reading about Google's deep learning or ANN research with photos and video, I wondered if a better approach was possible:

    The problem is grainy, low-quality video camera footage of a criminal. I thought maybe if you took other video from the same camera, like video of employees, then took high quality video or photos of the employees (from the same angles, lighting, and so forth), you could then see exactly how the security camera footage degraded the imagery. Then, maybe there would be a way to train a system to reverse the process, where given the input of grainy footage, it could restore detail and other qualities, giving you a much better image of the criminal.

    That would depend on a given grainy image or video snippet being unique to a given object (in this case, a human face). But now I wonder if several different faces could produce the exact same grainy footage. It seems unlikely. The one-to-many fidelity problem with compressed audio wouldn't apply here, would it?

  8. #8
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    403
    Thanks
    278
    Thanked 283 Times in 149 Posts
    The human brain with its built-in neural network is an amazing image, video and audio enhancement tool. May I say, one of the best?
    It has seen and heard plenty of high-definition material. It uses this information (retrieved from memory, and using its imagination) to enhance lower-quality images, video and audio. We can understand each other in a noisy environment; our brain can even "restore" (imagine) completely unrestorable parts of speech. But that's not the original. It's just the product of our sophisticated imagination (which is based on previous experiences).

    We have excellent image enhancement and audio enhancement software tools.
    They, too, can't restore the *original*, but they can remove (hide) some of the artifacts. By doing so they help the brain pick up more details. That's the trick.
    As SolidComp also feels, the result must be handled cautiously. This capability of the brain (and of any other tool) is *cheatable* - funny optical and audio illusions emerge from this fact.

    Edit:
    The desired level of audio enhancement is different for each person.
    Some like it softer, some like it sharper, some like deeper sounds, others like a more balanced sound stage. Some audiophiles like the pure original sound, even if it is flat (and not really enjoyable for most of us).
    Edit:
    A little noise and I can't understand the person talking to me. So I'd need more enhancement (my brain needs more help in a noisy environment), but most of the people I know (especially the ladies) would need no enhancement at all. That's the difference of our brains.
    So it is difficult (impossible) to create an enhancement tool that works perfectly for any individual.
    Last edited by Gotty; 28th May 2018 at 18:25. Reason: addition
