Does AAC use *inter-frame* compression?

Hello.

As the title says, I would like to know whether AAC uses inter-frame compression, i.e. does the decoding of frame N use or require any data from frame (N-1), similar to the "reservoir" feature of MP3?

Or in other words: If I take an existing AAC stream, can the frames be reordered, or can "foreign" frames (with matching sample-rate, channel-count and AAC profile, of course) be inserted, without corrupting the stream?

For testing purposes, I wrote a little program, which takes two AAC files (ADTS format) as input, reads these files frame by frame, and finally dumps all frames into a single file, in an interleaved way.
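
For anyone who wants to repeat the experiment, here is a minimal sketch of such an interleaver in Python. It is not the original program; the function names are made up for illustration. It only relies on the ADTS header layout (12-bit syncword, 13-bit frame length spread over header bytes 3 to 5) and assumes the input contains nothing but ADTS frames (no ID3 tags or other leading junk).

```python
def split_adts_frames(data: bytes) -> list:
    """Split a raw ADTS byte stream into individual frames (headers included)."""
    frames = []
    pos = 0
    while pos + 7 <= len(data):
        # Every ADTS frame starts with a 12-bit syncword of all ones.
        if data[pos] != 0xFF or (data[pos + 1] & 0xF0) != 0xF0:
            raise ValueError(f"lost ADTS sync at byte offset {pos}")
        # aac_frame_length: 13 bits spread across header bytes 3, 4 and 5.
        # It counts the whole frame, header included.
        length = ((data[pos + 3] & 0x03) << 11) | (data[pos + 4] << 3) | (data[pos + 5] >> 5)
        if length < 7 or pos + length > len(data):
            raise ValueError(f"bad frame length {length} at byte offset {pos}")
        frames.append(data[pos:pos + length])
        pos += length
    return frames


def interleave_frames(a: list, b: list) -> bytes:
    """Alternate frames (a0, b0, a1, b1, ...), stopping at the shorter input."""
    out = bytearray()
    for frame_a, frame_b in zip(a, b):
        out += frame_a
        out += frame_b
    return bytes(out)
```

Typical use would be to read both files with `open(path, "rb").read()`, pass each through `split_adts_frames`, and write the result of `interleave_frames` to a new file.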

Yes, the result will sound very strange. Anyway, as long as I use two LC-AAC files, created by FAAC, the "interleaved" file decodes fine. But if I use two HEv2-AAC files, created by libfdk_aac, I get millions of decoding errors from FFmpeg!

Note that I only interleaved compatible files, i.e. files with matching sample-rate, channel-count and AAC profile. Also, the two HEv2-AAC files decode flawlessly on their own in FFmpeg, but not after they had been interleaved...

At first glance, this looks like LC-AAC does not use inter-frame compression, but HEv2-AAC does. I suspect SBR could have something to do with this. But I might be wrong here.

Best Regards,
MuldeR

Reply #1
I think HE-AAC uses the bit reservoir for the SBR data.

From http://www.aes.org/events/118/papers/session.cfm?code=Z3:
Quote
Bit Reservoir Design for HE-AAC—Chi-Min Liu, Li-Wei Chen, Han-Wen Hsu, Wen-Chieh Lee, National Chiao-Tung University - Hsin-Chu, Taiwan

High Efficiency AAC (HE-AAC) has included the Spectral Band Replication (SBR) in combination with AAC to achieve high audio quality at bit rates lower than 96 kbits per second. SBR reconstructs high frequency signal through replicating the low frequency parts. The bits allocated to AAC encoder module and SBR module decides the quality and compression efficiency. In the past, we have designed the bit reservoir for AAC to reserve and predict the bits necessary for each time frame. The bit reservoir should be extended for HE-AAC especially for the SBR module. This paper considers the design of the bit reservoir for the HE-AAC. The efficiency of bit reservoir is verified through extensive objective tests.


By the way, MP3 and AAC both use the MDCT, so there is always an inter-frame dependency, regardless of bit reservoir. Essentially, some samples from the leading edge of each frame get crossfaded with some samples from the trailing edge of the previous one. Thus even if there are no structural dependencies (i.e. bit reservoir) that would introduce decoding errors in your experiment, the foreign frames are affecting part of the sound obtained from the native ones, and vice-versa.
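
To make the overlap argument concrete, here is a small pure-Python sketch of a windowed MDCT with overlap-add. It is a toy with a hop size of 8 samples and a sine window, not the actual AAC or MP3 filterbank (no block switching, no quantization), and all names in it are made up for illustration. It shows that each output hop is the sum of two adjacent frames, so replacing one frame also disturbs the samples it shares with its untouched neighbours.

```python
import math

N = 8  # hop size; each MDCT block spans 2*N samples and yields N coefficients

# Sine window: satisfies the Princen-Bradley condition w[n]^2 + w[n+N]^2 = 1.
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def _basis(n, k):
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))


def mdct(block):
    """Windowed forward MDCT: 2*N samples -> N coefficients."""
    return [sum(WINDOW[n] * block[n] * _basis(n, k) for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2*N time-aliased samples."""
    return [WINDOW[n] * (2.0 / N) * sum(coeffs[k] * _basis(n, k) for k in range(N))
            for n in range(2 * N)]


def encode(signal):
    """One frame per hop; frame i covers samples [i*N, i*N + 2*N)."""
    return [mdct(signal[i:i + 2 * N]) for i in range(0, len(signal) - N, N)]


def decode(frames):
    """Overlap-add the inverse transforms; the aliasing only cancels
    where two frames that were encoded next to each other overlap."""
    out = [0.0] * (N * (len(frames) + 1))
    for i, frame in enumerate(frames):
        for n, value in enumerate(imdct(frame)):
            out[i * N + n] += value
    return out


def hop_error(decoded, reference, hop):
    """Largest absolute error within one hop of N samples."""
    return max(abs(decoded[t] - reference[t]) for t in range(hop * N, (hop + 1) * N))


if __name__ == "__main__":
    native = [math.sin(0.3 * t) for t in range(5 * N)]
    foreign = [math.cos(0.7 * t) for t in range(5 * N)]
    frames = encode(native)
    frames[2] = encode(foreign)[2]  # swap in one "foreign" frame
    decoded = decode(frames)
    for hop in (1, 2, 3):
        print("hop", hop, "max error vs. native:", hop_error(decoded, native, hop))
```

Frame 2 overlaps hops 2 and 3, so both of those differ from the native signal after the swap, while hop 1 (frames 0 and 1) is still reconstructed exactly. The first and last hop are covered by one frame only and are never fully reconstructed in this toy.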

Before splicing MP3s, it's advisable to run them through MP3packer with the -r and -b 320 switches. This will spread the data into the largest frames possible (320 kbps) with minimal use of the bit reservoir. Unfortunately, it doesn't work on AAC-LC.

Reply #2
Quote
At a first glance, this looks like LC-AAC does not use inter-frame compression, but HEv2-AAC does. I suspect SBR could have something to do with this. But I might be wrong here

I think you are perfectly right. SBR (and probably also Parametric Stereo) use inter-frame techniques for lossless coding of the side-information.

In (HE-)AAC the bit-reservoir only exists on the encoder side to allocate bits for each frame (which can be of varying size, unlike in MP3), so it does not introduce frame dependencies at the decoder.

Chris
If I don't reply to your reply, it means I agree with you.

Reply #3
Thank you for the reply!

Quote
I think HE-AAC uses the bit reservoir for the SBR data.

From http://www.aes.org/events/118/papers/session.cfm?code=Z3

To my understanding, correct me if I'm wrong, in AAC we use variable-size frames anyway and the "reservoir" only exists for CBR streams. Also, unlike MP3, the "reservoir" does not actually store data that belongs to one frame in another frame. Instead, it's simply a formula that allows the frame sizes to vary as needed, within certain restrictions. If the frame sizes vary no more than the "reservoir" formula allows, we can still call this a valid CBR stream. Otherwise, it has to be VBR.
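
To illustrate what I mean by "simply a formula", here is a rough sketch of such a check in Python. It is a simplified bookkeeping model of my own, not the normative buffer model from the standard; `mean_bits` and `reservoir_bits` are hypothetical parameters that a real encoder would derive from the bitrate, sample rate and channel count.

```python
def fits_bit_reservoir(frame_bits, mean_bits, reservoir_bits):
    """Return True if the frame sizes stay within a simple reservoir model.

    'level' is the number of bits saved up so far. Each frame period adds
    mean_bits of budget and each frame spends its own size. Savings are
    capped at reservoir_bits, and a frame may never spend more than
    mean_bits plus the current savings.
    """
    level = 0
    for bits in frame_bits:
        budget = mean_bits + level
        if bits > budget:
            return False  # too large for a CBR stream under this model
        level = min(budget - bits, reservoir_bits)
    return True
```

For example, `fits_bit_reservoir([1000, 3000, 2000], 2000, 1500)` is accepted, because the second frame spends the 1000 bits the first one saved, while `[1000, 3500, 2000]` is rejected. Note that nothing is stored "in another frame" here: every frame still carries all of its own data, only the sizes are constrained.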

Quote
By the way, MP3 and AAC both use the MDCT, so there is always an inter-frame dependency, regardless of bit reservoir.

At least this doesn't seem to be a noticeable problem for "real world" audio. I have been splicing MP3 files in this way for a long time now and, as long as the reservoir is removed beforehand, I haven't noticed any issues: FFmpeg threw no decoding errors and none of my "MP3 validation" tools complained. So I think there are no decoding errors. At worst, it simply would not sound "optimal" at the junction point. But there's probably no way around that...

Quote
Before splicing MP3s, it's advisable to run them through MP3packer with the -r and -b 320 switches. This will spread the data into the largest frames possible (320 kbps) with minimal use of the bit reservoir.

Didn't know about "MP3packer", thanks for the pointer. Anyway, I have implemented the removal of the "reservoir" for MP3 myself already. I also use "--nores" with LAME at the highest bitrates, to avoid problems regarding the removal of reservoirs.

Reply #4
Quote
At a first glance, this looks like LC-AAC does not use inter-frame compression, but HEv2-AAC does. I suspect SBR could have something to do with this. But I might be wrong here

I think you are perfectly right. SBR (and probably also Parametric Stereo) use inter-frame techniques for lossless coding of the side-information.

Thanks a lot for confirming! And thanks for the reminder on Parametric Stereo!

Do you have any more specific information on this? Might it be possible to get rid of this inter-frame dependency, similar to how we can strip the "reservoir" from MP3 frames?

BTW: In the specs I found that SBR stores its data in the "sbr_extension_data" struct. But I found nothing that would clearly indicate an inter-frame dependency...

Quote
In (HE-)AAC the bit-reservoir only exists on the encoder side to allocate bits for each frame (which can be of varying size, unlike in MP3), so it does not introduce frame dependencies at the decoder.

This is what I was thinking too. Thus splicing AAC frames is supposed to work fine, in theory. However, it appears that SBR (and/or PS) will introduce some kind of "reservoir" in the MP3 sense.

Regards.


EDIT:

I think this whole problem may be related to the "ps_data" used by Parametric Stereo:
Code:
ps_data()
{
   if (enable_ps_header) {
      if (enable_iid) {
         iid_mode
         nr_iid_par = nr_iid_par_tab[iid_mode]
         nr_ipdopd_par = nr_ipdopd_par_tab[iid_mode]
      }
      if (enable_icc) {
         icc_mode
         nr_icc_par = nr_icc_par_tab[icc_mode]
      }
      enable_ext
   }
  
   [....]
}


If the "enable_ps_header" bit is not set, there's no new PS header data and, according to the spec, the last configuration persists in that case.

So it could happen that the frame directly after the junction point doesn't have this bit set and thus will no longer decode correctly, because it uses the configuration of the preceding frames, which have been exchanged?!

If that is the problem, a possible workaround is to enforce that "enable_ps_header" is always TRUE, even if the configuration didn't change. That would probably require a hacked encoder...
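
Here is a toy model of that suspicion in Python. It is not a real PS parser: the frame dictionaries, `ToyPsDecoder`, `make_stream` and the `intended_config` field are all invented for illustration. The only thing taken from the spec is the rule that the last configuration persists when "enable_ps_header" is not set.

```python
from typing import Optional


class ToyPsDecoder:
    """Remembers the most recently received header configuration."""

    def __init__(self) -> None:
        self.config: Optional[str] = None

    def decode(self, frame: dict) -> str:
        # A frame with enable_ps_header set carries a fresh configuration;
        # otherwise the previously received configuration stays in effect.
        if frame["enable_ps_header"]:
            self.config = frame["intended_config"]
        if self.config is None:
            raise ValueError("no PS configuration received yet")
        return self.config


def make_stream(config: str, length: int, header_every_frame: bool = False) -> list:
    """Hypothetical stream: header on the first frame only, or on every frame."""
    return [{"enable_ps_header": header_every_frame or i == 0,
             "intended_config": config}
            for i in range(length)]


def misdecoded_frames(frames: list) -> list:
    """Indices of frames decoded with a configuration other than their own."""
    decoder = ToyPsDecoder()
    return [i for i, frame in enumerate(frames)
            if decoder.decode(frame) != frame["intended_config"]]
```

Splicing `make_stream("A", 4)[:2] + make_stream("B", 4)[2:]` leaves frames 2 and 3 running on configuration "A", while the same splice with `header_every_frame=True` decodes every frame with its own configuration.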

Reply #5
Quote
So it could happen that the frame directly after the junction point doesn't have this bit set and thus will no longer decode correctly, because it uses the configuration of the preceding frames, which have been exchanged?!

If that is the problem, a possible workaround is to enforce that "enable_ps_header" is always TRUE, even if the configuration didn't change. That would probably require a hacked encoder...

Yes, I think that's the problem. AFAIR you can use time-differential coding of SBR and PS side-information. Exchanging the SBR header then gives you the wrong "history" for the following frames, corrupting the output.

As you said, to avoid this you have to tell the encoder(s) to write an SBR header into every frame (which greatly increases the side-information rate and is, in general, not recommended).
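
The principle can be sketched in a few lines of Python. This is only a toy: the real SBR and PS side-information is Huffman-coded and far more involved, and `keyframe_interval` is a hypothetical stand-in for "how often the encoder sends absolute values instead of deltas".

```python
def encode_deltas(values, keyframe_interval):
    """Send ('abs', v) every keyframe_interval frames, otherwise ('dt', v - previous)."""
    frames, prev = [], None
    for i, value in enumerate(values):
        if i % keyframe_interval == 0:
            frames.append(("abs", value))
        else:
            frames.append(("dt", value - prev))
        prev = value
    return frames


def decode_deltas(frames, start=0):
    """A 'dt' frame adds its delta to whatever the previous frame decoded to."""
    out, prev = [], start
    for kind, value in frames:
        prev = value if kind == "abs" else prev + value
        out.append(prev)
    return out


if __name__ == "__main__":
    a = encode_deltas([10, 11, 13, 12, 15, 16], keyframe_interval=3)
    b = encode_deltas([50, 52, 51, 55, 54, 56], keyframe_interval=3)
    # Frame 2 of stream b is a delta, but its "history" now comes from stream a.
    print(decode_deltas(a[:2] + b[2:]))
```

With an interval of 3, the spliced-in delta frame is applied to the wrong history and decodes to a wrong value until the next absolute frame arrives. With an interval of 1 every frame is self-contained, which is the "header in every frame" case, at the cost of a higher side-information rate.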

Chris
If I don't reply to your reply, it means I agree with you.